Summary



As we dive into the digital era, there is growing concern about the 
amount of personal digital information that is being gathered about us. 
Websites often track people's browsing behavior, health care insurers gather 
medical data, and many smartphones and navigation systems store or trans- 
mit information that makes it possible to track the physical location of 
their users at any time. Hence, anonymity, and privacy in general, are in- 
creasingly at stake. Anonymity protocols counter this concern by offering 
anonymous communication over the Internet. To ensure the correctness of 
such protocols, which are often extremely complex, a rigorous framework is 
needed in which anonymity properties can be expressed, analyzed, and ulti- 
mately verified. Formal methods provide a set of mathematical techniques 
that allow us to rigorously specify and verify anonymity properties. 

This thesis addresses the foundational aspects of formal methods for 
applications in security and in particular in anonymity. More concretely, 
we develop frameworks for the specification of anonymity properties and 
propose algorithms for their verification. Since in practice anonymity pro- 
tocols always leak some information, we focus on quantitative properties 
which capture the amount of information leaked by a protocol. 

We start our research on anonymity from its very foundations, namely 
conditional probabilities - these are the key ingredient of most quantitative 
anonymity properties. In Chapter 2 we present cpCTL, the first temporal 
logic making it possible to specify conditional probabilities. In addition, 
we present an algorithm to verify cpCTL formulas in a model-checking 
fashion. This logic, together with the model-checker, allows us to specify
and verify quantitative anonymity properties over complex systems where
probabilistic and nondeterministic behavior may coexist. 

We then turn our attention to more practical grounds: the construction
of algorithms to compute information leakage. More precisely, in Chapter
3 we present polynomial algorithms to compute the (information-theoretic) 
leakage of several kinds of fully probabilistic protocols (i.e. protocols with- 
out nondeterministic behavior). The techniques presented in this chapter 
are the first ones enabling the computation of (information-theoretic) leak- 
age in interactive protocols. 

In Chapter 4 we attack a well-known problem in distributed anonymity
protocols, namely full-information scheduling. To overcome this problem,
we propose an alternative definition of schedulers together with several
new definitions of anonymity (varying according to the attacker's power),
and revise the famous definition of strong anonymity from the literature.
Furthermore, we provide a technique to verify that a distributed protocol 
satisfies some of the proposed definitions. 

In Chapter 5 we provide (counterexample-based) techniques to debug 
complex systems, allowing for the detection of flaws in security protocols.
Finally, in Chapter 6 we briefly discuss extensions to the frameworks and 
techniques proposed in Chapters 3 and 4.

Chapter 1

Introduction 



1.1 Anonymity 

The word anonymity derives from the Greek ἀνωνυμία, which means
"without a name". In general, this term is used to express the fact that
the identity of an individual is not publicly known.

Since the beginning of human society, anonymity has been an important
issue. For instance, people have always felt the need to be able to express
their opinions without being identified, because of the fear of social and
economic retribution, harassment, or even threats to their lives.

1.1.1 The relevance of anonymity nowadays 

With the advent of the Internet, the issue of anonymity has been magnified
to extreme proportions. On the one hand, the Internet increases the
opportunities for interacting online, communicating information, expressing
opinions in public forums, etc. On the other hand, by using the Internet
we are disclosing information about ourselves: every time we visit a website
certain data about us may be recorded. In this way, organizations
like multinational corporations can build a permanent, commercially valuable
record of our interests. Similarly, every email we send goes through
multiple control points and is most likely scanned by profiling software
belonging to organizations like the National Security Agency of the USA.
Such information can be used against us, with consequences ranging from
slightly annoying practices like commercial spam to more serious offences
like the theft of credit card information for criminal purposes.

Anonymity, however, is not limited to individual issues: it has considerable
social and political implications. In countries controlled by repressive
governments, the Internet is becoming increasingly restricted, with
the purpose of preventing citizens from accessing uncensored information
and from sending information to the outside world. The role of
anonymizing technologies in this scenario is twofold: (1) they can help
users access sources of censored information via proxies, and (2) they can
help individuals to freely express their ideas (for instance via online forums).

The practice of censoring the Internet is actually not limited to repressive
governments. In fact, a recent research project conducted by the
universities of Harvard, Cambridge, Oxford and Toronto studied government
censorship in 46 countries and concluded that 25 of them, including
various western countries, filter to some extent communications concerning
political or religious positions.

Anonymizing technologies, like most technologies, can also be used for
malicious purposes. For instance, they can be used to facilitate harassment,
hate speech, financial scams, disclosure of private information, etc. Because
of their nature, they are actually more controversial than other technologies:
people are concerned that terrorists, pedophiles, or other criminals could
take advantage of them.

Whatever use one makes of anonymity, and whatever personal view
one may have on this topic, it is clearly important to be able to assess the
degree of anonymity of a given system. This is one of the aims of this thesis.

1.1.2 Anonymizing technologies nowadays

The most common use of anonymizing technologies is to prevent observers 
from discovering the source of communications. 

This is not an easy task, since in general users must include in the message
information about themselves. In practice, for Internet communication,
this information is the (unique) IP address of the computer in use, which
specifies its location in the topology of the network. This IP number is
usually logged along with the host name (logical name of the sender). Even
when the user connects to the Internet with a temporary IP number assigned
to him for a single session, this number is in general logged by the ISP
(Internet Service Provider), which makes it possible, with the ISP's
collaboration, to know who used a certain IP number at a certain time and
thus to find out the identity of the user.

The currently available anonymity tools aim at preventing the observers
of an online communication from learning the IP address of the participants.
Most applications rely on proxies, i.e. intermediary computers to which
messages are forwarded and which then appear as the senders of the
communication, thus hiding the original initiator. Nowadays a proxy server
is easy to set up and maintain. However, single-hop architectures, in which
all users enter and leave through the same proxy, create a single point of
failure which can significantly threaten the security of the network.
Multi-hop architectures have therefore been developed to increase the
performance as well as the security of the system. In so-called
daisy-chaining anonymization, for instance, traffic hops deliberately
via a series of participating nodes (changed for every new communication)
before reaching the intended receiver, which prevents any single entity from
identifying the user. Anonymouse [Ans], FilterSneak [Fil] and Proxify [Pro]
are well-known free web-based proxies, while Anonymizer [Ane] is currently
one of the leading commercial solutions.

1.1.3 Anonymizing technologies: a bit of history

Anonymous posting/reply services on the Internet were started around 1988
and were introduced primarily for use on specific newsgroups which discussed
particularly volatile, sensitive and personal subjects. In 1992,
anonymity services using remailers were originated by the Cypherpunks. Global
anonymity servers which served the entire Internet soon sprang up, combining
the functions of anonymous posting and anonymous remailing
in one service. The new global services also introduced the concept of
pseudonymity, which allowed anonymous mail to be replied to.

The first popular anonymizing tool was the Penet remailer developed 
by Johan Helsingius of Finland in the early 1990s. The tool was originally 
intended to serve only Scandinavia but Helsingius eventually expanded to 
worldwide service due to a flood of international requests. 

Based on this tool, in 1995 Mikael Berglund conducted a study on how
anonymity was used. His study was based on scanning all publicly available
newsgroups on a Swedish Usenet News server. He randomly selected a
number of messages from users of the Penet remailer and classified them
by topic. His results are shown in Figure 1.1.

In 1993, Cottrell wrote the Mixmaster remailer and two years later he 
launched Anonymizer which became the first Web-based anonymity tool. 

1.1.4 Anonymity and computer science 

The role of computer science with respect to anonymity is twofold. On the
one hand, the theory of communication helps in the design and implementation
of anonymizing protocols. On the other hand, as for all software
systems, there is the issue of correctness, i.e., of ensuring that the protocol
achieves the expected anonymity guarantees.

While most of the work on anonymity in the literature belongs to the
first challenge, this thesis addresses the second one. Ensuring the correctness
of a protocol involves (1) the use of formalisms to precisely model the
behaviour of the protocol, and (2) the use of formalisms to specify
unambiguously the desired properties. Once the protocol and its desired
properties have been specified, it is possible to employ verification techniques
to prove formally that the specified model satisfies such properties. These
topics belong to the branch of computer science called formal methods.

Percentage   Type of message
----------   ------------------------------------------------------------
30,0 %       Discussion
             Common topics: Sex, hobby, work, religion, politics, ethics,
             software.

23,1 %       Advertisements
             Common topics: Sexual/romantic contact advertisements
             dominated, a few other advertisements also used anonymity,
             for example ads searching for friends with a particular
             interest. The authors of contact ads were mostly male.

16,5 %       Questions and answers
             Common topics: Computer software issues, sex, medicine and
             drugs.

13,2 %       Texts
             Common topics: Pornographic texts, about 50 % heterosexual
             and 50 % homosexual (purported to be written by both men and
             women). Jokes, sometimes nasty.

9,9 %        Test messages
             To try out if the anonymity server works.

3,7 %        Pictures
             Mostly erotic/pornographic.

0,4 %        Computer software

3,3 %        Unclassifiable
             Written in a language the researcher could not read, such as
             several messages in Chinese. Note the repressive political
             regime in China, which may be a reason why there were several
             people who needed to use an anonymity server in discussing
             issues in that language.

Figure 1.1: Statistics on the Use of Anonymity - Penet

1.2 Formal methods 

Formal methods are a particular kind of mathematically-based techniques
used in computer science and software engineering for the specification and
verification of software and hardware systems. These techniques have their
foundations in the most diverse conceptual frameworks: logic calculi,
automata theory, formal languages, program semantics, etc.

1.2.1 The need of formal verification 

As explained in previous sections, Internet technologies play an important
role in our lives. However, the Internet is not the only kind of technology we
are in contact with: every day we interact with embedded systems such
as mobile phones, smart cards, GPS receivers, videogame consoles, digital
cameras, DVD players, etc. Technology also plays an important role in
life-critical systems, i.e., systems where the malfunction of any component
may result in loss of life. Examples of such systems can be found in the areas
of medicine, aeronautics, nuclear energy generation, etc.

The malfunction of a technological device can have important negative
consequences, ranging from material loss to loss of life. In the following we
list some famous examples of disasters caused by software failure.

Material loss: In 2004, the Air Traffic Control Center of Los Angeles
International Airport lost communication with airplanes, causing the
immediate suspension of all operations. The failure in the radio system was
due to a 32-bit countdown timer that decremented every millisecond. Due to
a bug in the software, when the counter reached zero the system shut down
unexpectedly. This communication outage disrupted about 600 flights
(including 150 cancellations), impacting over 30,000 passengers and causing
losses of millions of dollars to the airlines involved.
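As a rough check on the numbers in this example, the sketch below computes how long a 32-bit counter lasts when it is decremented once per millisecond. The starting value (the largest unsigned 32-bit number) is an assumption made for illustration; the text does not state it.

```python
# Assumption (not stated in the text): the timer starts at the largest
# unsigned 32-bit value and is decremented once per millisecond.
TICKS = 2**32 - 1                      # initial counter value
MS_PER_DAY = 24 * 60 * 60 * 1000       # milliseconds in one day

days_until_zero = TICKS / MS_PER_DAY
print(f"counter reaches zero after about {days_until_zero:.1f} days")
```

Under this assumption the counter runs out after a little less than fifty days of continuous operation.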

In 1996, an Ariane 5 rocket launched by the European Space Agency
exploded just forty seconds after lift-off. The rocket was on its first voyage,
after a decade of development costing US$ 7 billion. The destroyed rocket
and its cargo were valued at US$ 500 million. A board of inquiry investigated
the causes of the explosion and in two weeks issued a report. It
turned out that the cause of the failure was a software error in the inertial
reference system. Specifically, a 64-bit floating-point number related to the
horizontal velocity of the rocket was converted to a 16-bit signed integer.
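To illustrate why such a conversion is dangerous, the following sketch narrows a floating-point value to a 16-bit signed integer and shows that values outside the range -32768..32767 cannot be represented. It is only an illustration: the helper name `to_int16` and the sample values are hypothetical, and the code is unrelated to the actual flight software.

```python
import struct

def to_int16(x: float) -> int:
    """Convert a float to a 16-bit signed integer, failing on overflow."""
    # The 'h' format is a 16-bit signed integer; struct.pack raises
    # struct.error when the value lies outside -32768..32767.
    (value,) = struct.unpack("<h", struct.pack("<h", int(x)))
    return value

print(to_int16(1234.5))          # fits; the fractional part is truncated
try:
    to_int16(40000.0)            # hypothetical out-of-range reading
except struct.error as exc:
    print("conversion failed:", exc)
```

A conversion of this kind is only safe when the surrounding code guarantees, or checks, that the value stays within the target range.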

In the early nineties a bug (discovered by a professor of Lynchburg College,
USA) in the floating-point division unit of the Intel Pentium processor
not only severely damaged Intel's reputation, but also forced the
replacement of faulty processors, causing a loss of 475 million US dollars for
the company.



Fatal loss: A software flaw in the control part of the radiation therapy
machine Therac-25 caused the death of six cancer patients between 1985 and
1987 as they were exposed to an overdose of radiation.

In 1995 American Airlines Flight 965, connecting Miami and Cali, crashed
just five minutes before its scheduled arrival. The accident led to a total
of 159 deaths. Paris Kanellakis, a well-known researcher (creator of the
partition refinement algorithm, broadly used to verify bisimulation), was on
the flight together with his family. Investigations concluded that the
accident was caused by a sudden turn of the aircraft initiated by the
autopilot after an instruction of one of the pilots: the pilot
input 'R' in the navigational computer referring to a location called 'Rozo'
but the computer erroneously interpreted it as a location called 'Romeo'
(due to the spelling similarity and physical proximity of the locations).

As the use and complexity of technological devices grow quickly,
mechanisms to improve their correctness have become indispensable. But how
can we be sure of the correctness of such technologies, with thousands (and
sometimes millions) of components interacting in complex ways? One possible
answer is by using formal verification, a branch of formal methods.

1.2.2 Formal verification

Formal verification is considered a fundamental area of study in computer 
science. In the context of hardware and software systems, formal verifica- 
tion is the act of proving or disproving the correctness of the system with 
respect to a certain property, using formal methods. In order to achieve 
this, it is necessary to construct a mathematical model describing all pos- 
sible behaviors of the system. In addition, the property must be formally 
specified avoiding, in this way, possible ambiguities. 

Important formal verification techniques include theorem proving, sim- 
ulation, testing, and model checking. In this thesis we focus on the use of 
this last technique. 

Model checking Model checking is an automated verification technique 
that, given a finite model of the system and a formal property, systemati- 
cally checks whether the property holds in the model or not. In addition, 
if the property is falsified, debugging information is provided in the form 
of a counterexample. This situation is represented in Figure 1.3. 

Typical properties that can be verified are "Can the system reach a deadlock
state?" or "Is every sent message received with probability at least
0.99?". Such automated verification is carried out by a so-called model
checker, an algorithm that exhaustively searches the state space of the
model looking for states violating the (correctness) property.

A major strength of model checking is the capability of generating 
counterexamples which provide diagnostic information in case the prop- 
erty is violated. Edmund M. Clarke, one of the pioneers of Model Check- 
ing said [Cla08]: "It is impossible to overestimate the importance of the 
counterexample feature. The counterexamples are invaluable in debugging 
complex systems. Some people use model checking just for this feature". In 
case a state violating the property under consideration is encountered, the 
model checker provides a counterexample describing a possible execution 
that leads from the initial state of the system to a violating state. 
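The exhaustive search and the counterexample it produces can be sketched in a few lines. The code below is a minimal explicit-state reachability check, not the algorithm of any particular model checker; the function `find_counterexample` and the toy transition system are hypothetical and chosen only for illustration.

```python
from collections import deque

def find_counterexample(initial, successors, is_bad):
    """Breadth-first search for a reachable state violating the property.

    Returns a shortest execution from `initial` to a violating state,
    or None if no violating state is reachable (the property holds).
    """
    parent = {initial: None}          # also serves as the visited set
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if is_bad(state):
            path = []
            while state is not None:  # walk back to the initial state
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None

# Hypothetical toy model: "stuck" has no outgoing transitions (a deadlock).
TRANSITIONS = {
    "idle": ["request"],
    "request": ["granted", "timeout"],
    "granted": ["idle"],
    "timeout": ["retry"],
    "retry": ["request", "stuck"],
    "stuck": [],
}

def is_deadlock(state):
    return not TRANSITIONS[state]

trace = find_counterexample("idle", lambda s: TRANSITIONS[s], is_deadlock)
print(trace)
```

The returned list is exactly the kind of diagnostic trace discussed above: an execution leading from the initial state to a state that violates the property.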

Other important advantages of model checking are: it is highly automatic,
so it requires little interaction and knowledge of designers; it is
rather fast; it can be applied to a large range of problems; and it allows
partial specifications.

Figure 1.3: Schematic view of model-checking approach

The main disadvantage of model checking is that the state space of certain
systems, for instance distributed systems, can be rather large, thus
making the verification inefficient and in some cases even infeasible (because
of memory limitations). This problem is known as the state explosion
problem. Many techniques to alleviate it have been proposed since the
invention of model checking. Among the most popular ones we mention the
use of Binary Decision Diagrams (BDDs), partial order reduction, abstraction,
compositional reasoning, and symmetry reduction. State-of-the-art
model checkers can easily handle up to 10^ states with explicit state rep- 
resentation. For certain specific problems, more dedicated data structures 
(like BDDs) can be used thus making it possible to handle even up to 10^™ 
states. 

The popularity of model checking has grown considerably since its invention
at the beginning of the 80s. Nowadays, model checking techniques
are employed by most or all leading hardware companies (e.g. INTEL, IBM
and MOTOROLA, to mention just a few). While model checking
is applied less frequently by software development companies, there have
been several cases in which it has helped to detect previously unknown
defects in real-world software. A prominent example is the result of research
in Microsoft's SLAM project, in which several formal techniques
were used to automatically detect flaws in device drivers. In 2006, Microsoft
released the Static Driver Verifier (SDV) as part of Windows Vista. SDV
uses the SLAM software-model-checking engine to detect cases in which device
drivers linked to Vista violate one of a set of interface rules. Thus SDV
helps uncover defects in device drivers, a primary source of software bugs
in Microsoft applications. Investigations have shown that model checking
procedures would have revealed the exposed defects in, e.g., Intel's Pentium
processor and the Therac-25 radiation therapy machine.

Focus of this thesis This thesis addresses the foundational aspects of 
formal methods for applications in security and in particular in anonymity: 
We investigate various issues that have arisen in the area of anonymity, we 
develop frameworks for the specification of anonymity properties, and we 
propose algorithms for their verification. 

1.3 Background 

In this section we give a brief overview of the various approaches to the 
foundations of anonymity that have been explored in the literature. We 
will focus on anonymity properties, although the concepts and techniques 
developed for anonymity apply to a large extent also to neighboring topics
like information flow, secrecy, and privacy. The common denominator of these
problems is the prevention of the leakage of information. More precisely, 
we are concerned with situations in which there are certain values (data, 
identities, actions, etc) that are intended to be secret, and we want to 
ensure that an adversary will not be able to infer the secret values from 



1.3. Background 



11 



the information which is publicly available. Some researchers use the term 
information hiding to refer to this class of problems [HO05]. 

The frameworks for reasoning about anonymity can be classified into 
two main categories: the possibilistic approaches, and the probabilistic (or 
quantitative) ones. 

Possibilistic notions 

The term "possibilistic" refers to the fact that we do not consider
quantitative aspects. More precisely, anonymity is formulated in terms of the
possibility of inferring some secrets, without worrying about "how likely"
this is, or "how much" we narrow down the secret.

These approaches have been widely explored in the literature, using
different conceptual frameworks. Examples include the proposals based
on epistemic logic ([SS99, HO05]), on "function views" ([HS04]), and on
process equivalences (see for instance [SS96, RS01]). In the following we
will focus on the latter kind.

In general, possibilistic anonymity means that the observables do not
identify a unique culprit. Often this property relies on nondeterminism:
for each culprit, the system should be able to produce alternative executions
with different observables, so that in turn for a given observable there are
many agents that could be the culprit. More precisely, in its strongest
version this property can be expressed as follows: if in one computation
the identity of the culprit is i and the observable outcome is o, then for
every other agent j there must be a computation where, with culprit j, the
observable is still o.
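The strongest version of the property stated above can be checked mechanically once the computations of a system are abstracted to pairs (culprit, observable). The sketch below assumes such an abstraction; the function name, the agents and the observable labels are hypothetical.

```python
def possibilistically_anonymous(computations, agents):
    """Check the strongest possibilistic property.

    `computations` is a set of (culprit, observable) pairs, one for each
    possible computation. The property holds if every observable that
    can occur at all can occur with every agent as the culprit.
    """
    observables = {o for _, o in computations}
    return all((j, o) in computations
               for o in observables for j in agents)

agents = {"alice", "bob", "carol"}
# Every agent can produce every observable: anonymous.
anonymous = {(a, o) for a in agents for o in ("o1", "o2")}
# Remove one computation: observable "o1" now rules out carol.
leaky = anonymous - {("carol", "o1")}

print(possibilistically_anonymous(anonymous, agents))
print(possibilistically_anonymous(leaky, agents))
```

Note that the check only asks whether a computation exists; it says nothing about how likely that computation is.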

This kind of approach can be applied also in case of systems that use ran- 
domization. The way this is done is by abstracting the probabilistic choices 
into nondeterministic ones. See for example the Dining Cryptographers ex- 
ample in [SS96], where the coin tossing is represented by a nondeterministic 
process. 

In general the possibilistic approaches have the advantages of simplicity
and efficiency. On the negative side, they lack precision, and in some cases
the approximation can be rather loose. This is because every scenario that
has a non-null probability is interpreted as possible. For instance, consider
the case in which a system reveals the culprit 90 percent of the time by
outputting his identity, while in the remaining 10 percent of the time it
outputs the name of some other agent. The system would not look very
anonymous. Yet, the possibilistic definition of anonymity would be satisfied
because all users would appear as possible culprits to the observer regardless
of the output of the system. In general, in the possibilistic approach the
strongest notion of anonymity we can express is possible innocence, which
is satisfied when no agent appears to be the culprit for sure: there is always
the possibility that he is innocent (no matter how unlikely it is).
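The gap between the two views can be made concrete on the 90/10 example. The sketch below adds two assumptions that the text leaves open: the culprit is chosen uniformly among n agents, and in the remaining 10 percent of the cases the wrong name is chosen uniformly among the other n - 1 agents. All function names are hypothetical.

```python
from fractions import Fraction

def channel(n):
    """P(o | i): the culprit's own name is output with probability 9/10,
    each other name with probability (1/10) / (n - 1) (an assumption)."""
    other = Fraction(1, 10) / (n - 1)
    return [[Fraction(9, 10) if o == i else other for o in range(n)]
            for i in range(n)]

def posterior(matrix, prior, o):
    """P(i | o) for every identity i, obtained by Bayes' rule."""
    joint = [prior[i] * matrix[i][o] for i in range(len(prior))]
    total = sum(joint)
    return [p / total for p in joint]

n = 4
matrix = channel(n)
prior = [Fraction(1, n)] * n          # uniform a priori distribution
post = posterior(matrix, prior, o=0)

# Possibilistic view: every agent can produce every observable.
possible = all(matrix[i][o] > 0 for i in range(n) for o in range(n))
print(possible, post[0])
```

Every agent remains a possible culprit, yet after observing a name the observer assigns probability 9/10 to that agent, which is the imprecision described above.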

In this thesis we consider only the probabilistic approaches. Their com- 
mon feature is that they deal with probabilities in a concrete way and they 
are, therefore, much more precise. They have become very popular in re- 
cent times, and there has been a lot of work dedicated to understanding 
and formalizing the notion in a rigorous way. In the next section we give a 
brief overview of these efforts. 

Probabilistic notions 

These approaches take probabilities into account, and are based on the 
likelihood that an agent is the culprit, for a given observable. One notion 
of probabilistic anonymity which has been thoroughly investigated in the 
literature is strong anonymity. 

Strong anonymity Intuitively the idea behind this notion is that the
observables should not allow one to infer any (quantitative) information about
the identity of the culprit. The corresponding notion in the field of
information flow is (quantitative) non-interference.

Once we try to formalize more precisely the above notion we discover 
however that there are various possibilities. Correspondingly, there have 
been various proposals. We recall here the three most prominent ones. 

1. Equality of the a posteriori probabilities for different culprits. The
idea is to consider a system strongly anonymous if, given an observable
o, the a posteriori probability that the identity of the culprit
is i, P(i|o), is the same as the a posteriori probability of any other
identity j. Formally:

P(i|o) = P(j|o) for all observables o, and all identities i and j (1.1)

This is the spirit of the definition of strong anonymity by Halpern and
O'Neill [HO05], although their formalization involves more sophisticated
epistemic notions.

2. Equality of the a posteriori and a priori probabilities for the same
culprit. Here the idea is to consider a system strongly anonymous
if, for any observable, the a posteriori probability that the culprit
is a certain agent i is the same as its a priori probability. In other
words, the observation does not increase or decrease the support for
suspecting a certain agent. Formally:

P(i|o) = P(i) for all observables o, and all identities i (1.2)

This is the definition of anonymity adopted by Chaum in [Cha88]. He
also proved that the Dining Cryptographers satisfy this property if
the coins are fair. Halpern and O'Neill consider a similar property in
their epistemological setting, and they call it conditional anonymity
[HO05].

3. Equality of the likelihood of different culprits. In this third definition 
a system is strongly anonymous if, for any observable o and agent i, 
the likelihood of i being the culprit, namely P(o|i), is the same as the 
likelihood of any other agent j. Formally: 

P(o|i) = P(o|j) for all observables o, and all identities i and j (1.3) 

This was proposed as a definition of strong anonymity by Bhargava and
Palamidessi [BP05].

In [BCPP08] it has been proved that definitions (1.2) and (1.3) are
equivalent. Definition (1.3) has the advantage that it extends in a natural
way to the case in which the choice of the culprit is nondeterministic.
This could be useful when we do not know the a priori distribution of the
culprit, or when we want to abstract from it (for instance because we are
interested in the worst case).

Concerning Definition (1.1), at first sight it probably looks the most
natural, but it actually turns out to be far too strong. In fact it is
equivalent to (1.2) and (1.3), plus the following condition:

P(i) = P(j) for all identities i and j (1.4)

namely the condition that the a priori distribution be uniform.

It is interesting to notice that (1.1) can be split into two orthogonal
properties: (1.3), which depends only on the protocol, and (1.4), which
depends only on the distribution on the secrets.
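The relationship between the four conditions can be observed on a small example. The sketch below represents a protocol as a matrix of conditional probabilities P(o|i) and checks (1.1) to (1.4) with exact rational arithmetic. The function names and the sample protocol are hypothetical, and all a priori probabilities are assumed to be strictly positive.

```python
from fractions import Fraction as F

def posteriors(prior, matrix):
    """Return {o: [P(i | o) for each i]} for observables with P(o) > 0."""
    post = {}
    for o in range(len(matrix[0])):
        joint = [prior[i] * matrix[i][o] for i in range(len(prior))]
        p_o = sum(joint)
        if p_o > 0:
            post[o] = [p / p_o for p in joint]
    return post

def def_1_1(prior, matrix):      # P(i|o) = P(j|o)
    return all(len(set(row)) == 1
               for row in posteriors(prior, matrix).values())

def def_1_2(prior, matrix):      # P(i|o) = P(i)
    return all(row == list(prior)
               for row in posteriors(prior, matrix).values())

def def_1_3(matrix):             # P(o|i) = P(o|j)
    return all(row == matrix[0] for row in matrix)

def cond_1_4(prior):             # P(i) = P(j)
    return len(set(prior)) == 1

# Hypothetical protocol whose rows are identical, so (1.3) holds.
matrix = [[F(1, 4), F(3, 4)]] * 3
skewed = [F(1, 2), F(1, 3), F(1, 6)]
uniform = [F(1, 3)] * 3

print(def_1_3(matrix), def_1_2(skewed, matrix), def_1_1(skewed, matrix))
print(cond_1_4(uniform), def_1_1(uniform, matrix))
```

With the skewed prior, (1.2) and (1.3) hold while (1.1) fails; with the uniform prior, where (1.4) also holds, (1.1) is satisfied as well.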

Unfortunately all the strong anonymity properties discussed above are
too strong, and almost never achievable in practice. Hence researchers have
started exploring weaker notions. One of the most renowned properties of
this kind (among the "simple" ones based on conditional probabilities) is
that of probable innocence.

Probable innocence The notion of probable innocence was formulated
by Reiter and Rubin in the context of their work on the Crowds protocol
[RR98]. Intuitively the idea is that, after the observation, no agent is more
likely to be the culprit than not to be. Formally:

P(i|o) ≤ P(¬i|o) for all observations o, and all identities i

or equivalently

P(i|o) ≤ 1/2 for all observations o, and all identities i

In [RR98] Reiter and Rubin proved that the Crowds protocol satisfies probable
innocence under a certain assumption on the number of attackers
relative to the number of honest users.
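For a protocol given as a matrix of conditional probabilities P(o|i) and an a priori distribution, the condition P(i|o) ≤ 1/2 can be checked directly. The sketch below does so with exact arithmetic; the function name and both sample matrices are hypothetical, and the code does not model the Crowds protocol.

```python
from fractions import Fraction as F

def probable_innocence(prior, matrix):
    """Check P(i | o) <= 1/2 for every identity i and every observable o
    that occurs with positive probability."""
    for o in range(len(matrix[0])):
        joint = [prior[i] * matrix[i][o] for i in range(len(prior))]
        p_o = sum(joint)
        # P(i | o) > 1/2 is the same as 2 * P(i, o) > P(o).
        if p_o > 0 and any(2 * p > p_o for p in joint):
            return False
    return True

uniform = [F(1, 3)] * 3
# Hypothetical protocol: the culprit's own observable has probability 1/2.
mild = [[F(1, 2), F(1, 4), F(1, 4)],
        [F(1, 4), F(1, 2), F(1, 4)],
        [F(1, 4), F(1, 4), F(1, 2)]]
# Hypothetical protocol that reveals the culprit 9 times out of 10.
revealing = [[F(9, 10), F(1, 20), F(1, 20)],
             [F(1, 20), F(9, 10), F(1, 20)],
             [F(1, 20), F(1, 20), F(9, 10)]]

print(probable_innocence(uniform, mild))
print(probable_innocence(uniform, revealing))
```

The first protocol sits exactly on the threshold, with a posteriori probability 1/2 for the most suspicious agent, so the property holds; the second one violates it.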

All the notions discussed above are rather coarse, in the sense that they
are cut-off notions and do not allow one to represent small variations in the
degree of anonymity. In order to be able to compare different protocols in
a more precise way, researchers have started exploring settings to measure
the degree of anonymity. The most popular of these approaches are those
based on information theory.

Information theory 

The underlying idea is that anonymity systems are interpreted as channels
in the information-theoretic sense. The input values are the possible
identities of the culprit, which, associated with a probability distribution,
form a random variable Id. The outputs are the observables, and the transition
matrix consists of the conditional probabilities of the form P(o|i),
representing the probability that the system produces an observable o when the
culprit is i. A central notion here is the Shannon entropy, which represents
the uncertainty of a random variable. For the culprit's possible identity,
this is given by:

H(Id) = -\sum_{i} P(i) \log P(i)    (uncertainty a priori)

Note that Id and the matrix also determine a probability distribution on
the observables, which can then be seen as another random variable Ob.
The conditional entropy H(Id|Ob), representing the uncertainty about the
identity of the culprit after the observation, is given by

H(Id \mid Ob) = -\sum_{o} P(o) \sum_{i} P(i \mid o) \log P(i \mid o)    (uncertainty a posteriori)

It can be shown that 0 ≤ H(Id|Ob) ≤ H(Id). We have H(Id|Ob) = 0
when there is no uncertainty left about Id after the value of Ob is known,
namely, when the value of Ob completely determines the value of Id. This
is the case of maximum leakage. At the other extreme, we have H(Id|Ob) =
H(Id) when Ob gives no information about Id, i.e. when Ob and Id are
independent.
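The two entropies and their extreme cases can be computed directly from the a priori distribution and the matrix. The sketch below assumes logarithms in base 2 (the text does not fix the base), so results are in bits; the function names and sample channels are hypothetical.

```python
import math

def entropy(dist):
    """Shannon entropy in bits; terms with probability 0 contribute 0."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def conditional_entropy(prior, matrix):
    """H(Id | Ob) = -sum_o P(o) sum_i P(i|o) log P(i|o)."""
    h = 0.0
    for o in range(len(matrix[0])):
        joint = [prior[i] * matrix[i][o] for i in range(len(prior))]
        p_o = sum(joint)
        if p_o > 0:
            # The inner sum is the entropy of the posterior P(. | o).
            h += p_o * entropy([p / p_o for p in joint])
    return h

prior = [0.25] * 4
# Maximum leakage: each observable identifies the culprit.
revealing = [[1.0 if o == i else 0.0 for o in range(4)] for i in range(4)]
# No leakage: all rows are equal, so Ob is independent of Id.
silent = [[0.5, 0.5] for _ in range(4)]

print(entropy(prior))
print(conditional_entropy(prior, revealing))
print(conditional_entropy(prior, silent))
```

For the first channel the a posteriori uncertainty drops to 0, while for the second it stays equal to the a priori uncertainty H(Id).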

The difference between $H(Id)$ and $H(Id \mid Ob)$ is called mutual information
and it is denoted by $I(Id; Ob)$:

\[ I(Id; Ob) = H(Id) - H(Id \mid Ob) \]

The maximum mutual information between Id and Ob over all possible
input distributions is known as the channel's capacity:

\[ C = \max I(Id; Ob) \]

In the case of anonymity, the mutual information represents the dif- 
ference between the a priori and the a posteriori uncertainties about the 
identity of the culprit. It can therefore be considered as the leakage of 
information due to the system, i.e. the amount of anonymity which is lost 
because of the observables produced by the system. Similarly, the capacity
represents the worst-case leakage under all possible distributions on the
culprit's possible identities. It can be shown that the capacity is 0 if and only
if the rows of the matrix are pairwise identical. This corresponds exactly
to the version (1.3) of strong anonymity.
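
The quantities above can be computed directly from a prior and a channel matrix. The following Python fragment is a minimal sketch, not taken from the thesis: the function names, the two example channels and the choice of base-2 logarithms are assumptions made for illustration.

```python
import math

def shannon_entropy(dist):
    """H(X) = -sum_x P(x) log2 P(x); terms with P(x) = 0 contribute 0."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def mutual_information(prior, channel):
    """I(Id; Ob) = H(Id) - H(Id | Ob) for a prior P(i) and rows P(o | i)."""
    n_ids, n_obs = len(prior), len(channel[0])
    # Marginal on the observables: P(o) = sum_i P(i) P(o | i)
    p_obs = [sum(prior[i] * channel[i][o] for i in range(n_ids))
             for o in range(n_obs)]
    cond_entropy = 0.0
    for o, po in enumerate(p_obs):
        if po == 0:
            continue  # observables that never occur carry no weight
        posterior = [prior[i] * channel[i][o] / po for i in range(n_ids)]
        cond_entropy += po * shannon_entropy(posterior)
    return shannon_entropy(prior) - cond_entropy

# Illustrative channels (assumed): identical rows versus a fully revealing one.
prior = [0.5, 0.5]
identical_rows = [[0.3, 0.7], [0.3, 0.7]]
identity_channel = [[1.0, 0.0], [0.0, 1.0]]
leak_none = mutual_information(prior, identical_rows)
leak_full = mutual_information(prior, identity_channel)
```

With identical rows the observables are independent of the identity, so the leakage is zero for this prior; with the identity channel the full bit of a priori uncertainty is leaked.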

This view of the degree of anonymity has been advocated in various
works, including [MNCM03, MNS03, ZB05, CPP08a]. In the context of
information flow, the same view of leakage in information-theoretic terms
has been widely investigated as well. Without pretending to be exhaustive,
we mention [McL90, Gra91, CHM01, CHM05a, Low02, Bor06].

In [Smi09] Smith has investigated the use of an alternative notion of
entropy, namely Rényi's min-entropy [Ren60], and has proposed to define
leakage as the analogue of mutual information in the setting of Rényi's
min-entropy. The justification for proposing this variant is that it better
represents certain attacks called one-try attacks. In general, as Köpf and
Basin illustrate in their cornerstone paper [KB07], one can use the above
information-theoretic approach with many different notions of entropy, each
representing a different model of attacker, and a different way of measuring
the success of an attack.
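
For comparison with the Shannon case, min-entropy leakage can be sketched in a few lines. The fragment below assumes the usual formulation associated with [Smi09], in which the a priori vulnerability is $\max_i P(i)$ and the a posteriori vulnerability is $\sum_o \max_i P(i) P(o \mid i)$; the function name and the example channel are illustrative assumptions.

```python
import math

def min_entropy_leakage(prior, channel):
    """log2 of (a posteriori vulnerability / a priori vulnerability)."""
    n_ids, n_obs = len(prior), len(channel[0])
    # Probability that a single guess succeeds before any observation.
    prior_vulnerability = max(prior)
    # Probability that a single guess succeeds after seeing the observable.
    posterior_vulnerability = sum(
        max(prior[i] * channel[i][o] for i in range(n_ids))
        for o in range(n_obs)
    )
    return math.log2(posterior_vulnerability / prior_vulnerability)

# Assumed example: four equally likely identities; the observable reveals
# only whether the culprit is in {0, 1} or in {2, 3}.
uniform = [0.25, 0.25, 0.25, 0.25]
half_revealing = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
leak_bits = min_entropy_leakage(uniform, half_revealing)
```

In this example the chance of guessing the culprit in one try doubles after the observation, which corresponds to one bit of min-entropy leakage.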

A different information-theoretic approach to leakage has been proposed 
in [CMS09]: in that paper, the authors define as information leakage the 
difference between the a priori accuracy of the guess of the attacker, and the 
a posteriori one, after the attacker has made his observation. The accuracy 
of the guess is defined as the Kullback-Leibler distance between the belief 
(which is a weight attributed by the attacker to each input hypothesis) and
the true distribution on the hypotheses. In [HSP10] a Rényi min-entropy
variant of this approach has been explored as well.

We conclude this section by remarking that, in all the approaches dis- 
cussed above, the notion of conditional probability plays a central role. 

1.4 Contribution and plan of the thesis 

We have seen in Section 1.3 that conditional probabilities are the key ingre- 
dients of all quantitative definitions of anonymity. It is therefore desirable 
to develop techniques to analyze and compute such probabilities. 

Our first contribution is cpCTL, a temporal logic allowing us to specify 
properties concerned with conditional probabilities in systems combining 
probabilistic and nondeterministic behavior. This is presented in Chapter
2. cpCTL is essentially pCTL (probabilistic Computation Tree Logic)
[HJ94] enriched with formulas of the form $P_{\bowtie a}[\phi \mid \psi]$; for instance,
$P_{\leq a}[\phi \mid \psi]$ states that the probability of $\phi$ given $\psi$ is at most $a$.
We propose a model checker for cpCTL. Its design
has been quite challenging, due to the fact that the standard model
checking algorithms for pCTL in MDPs (Markov Decision Processes) do 
not extend to conditional probability formulas. More precisely, in contrast 
to pCTL, verifying a conditional probability cannot be reduced to a linear 
optimization problem. A related point is that, in contrast to pCTL, the 
optimal probabilities are not attained by history independent schedulers. 
We attack the problem by proposing the notion of semi history indepen- 
dent schedulers, and we show that these schedulers do attain optimality 
with respect to the conditional probabilities. Surprisingly, it turns out that 
we can further restrict to deterministic schedulers, and still attain optimal- 
ity. Based on these results, we show that it is decidable whether a cpCTL 
formula is satisfied in a MDP, and we provide an algorithm for it. In ad- 
dition, we define the notion of counterexample for the logic and sketch an 
algorithm for counterexample generation. 

Unfortunately, the verification of conditional cpCTL formulae is not
efficient in the presence of nondeterminism. Another issue, related to
nondeterminism in security applications, is the well-known
problem of almighty schedulers (see Chapter 4). Such schedulers have the
(unrealistic) ability to peek at the internal secrets of the components and make
their scheduling policy dependent on these secrets, thus leaking the secrets
to external observers. We address these problems in separate chapters.

In Chapter 3 we restrict the framework to purely probabilistic models 
where secrets and observables do not interact, and we consider the prob- 
lem of computing the leakage and the maximal leakage in the information- 
theoretic approach. These are defined as mutual information and capacity, 
respectively. We address these notions with respect to both the Shannon 
entropy and the Rényi min-entropy. We provide techniques to compute
channel matrices in $O((o \times q)^3)$ time, where $o$ is the number of observables,
and $q$ the number of states. (From the channel matrices, we can compute
mutual information and capacity using standard techniques.) We also show
that, contrary to what has been stated in the literature, the standard
information-theoretic approaches to leakage do not extend to the case in which secrets
and observables interact.

In Chapter 4 we consider the problem of the almighty schedulers. We 
define a restricted family of schedulers (admissible schedulers) which can- 
not base their decisions on secrets, thus providing more realistic notions 
of strong anonymity than arbitrary schedulers. We provide a framework
to represent concurrent systems composed of purely probabilistic
components. At the global level we still have nondeterminism, due to the various
possible ways the components may interact with each other. Schedulers
are then defined as devices that select at every point of the computation 
the component(s) moving next. Admissible schedulers make this choice 
independently from the values of the secrets. In addition, we provide a 
sufficient (but not necessary) technique based on automorphisms to prove 
strong anonymity for this family of schedulers. 

The notion of counterexample has been approached indirectly in Chap- 
ters 2 and 3. In Chapter 5 we come back and fully focus on this topic. We 
propose a novel technique to generate counterexamples for model checking 
on Markov Chains. Our proposal is to group together violating paths that
are likely to provide similar debugging information, thus alleviating the
debugging tasks. We do so by using strongly connected component analysis
and show that it is possible to extend these techniques to Markov Decision
Processes.

Chapter 6 is an overview chapter. There we briefly describe extensions 
to the frameworks presented in Chapters 3 and 4^. First, we consider the 
case in which secrets and observables interact, and show that it is still pos- 
sible to define an information-theoretic notion of leakage, provided that we 
consider a more complex notion of channel, known in literature as channel 
with memory and feedback. Second, we extend the systems proposed in 
Chapter 4 by allowing nondeterminism also internally to the components. 
Correspondingly, we define a richer notion of admissible scheduler and we
use it for defining notions of process equivalence relating to nondeterminism
in a more flexible way than the standard ones in the literature. In particular,
we use these equivalences for defining notions of anonymity robust with
respect to implementation refinement.

In Figure 1.4 we describe the relation between the different chapters of 
the thesis. Chapter 5 is not explicitly depicted in the figure because it does 
not fit in any of the branches of cpCTL (efficiency - security foundations) . 
However, the techniques developed in Chapter 5 have been applied to the 
works in both Chapters 2 and 3. 

We conclude this thesis in Chapter 7, where we present a summary of
our main contributions and discuss further directions.

1.5 Origins of the Chapters and Credits 

In the following we list, for each chapter, the set of related articles together 
with their publication venue and corresponding co-authors. 

• Chapter 2 is mainly based on the article [AvR08] by Peter van Rossum
and myself. The article was presented in TACAS 2008. In addition,
this chapter contains material from an extended version of [AvR08] that
is being prepared for submission to a journal.

^For more information about the topics discussed in Chapter 6 we refer the reader
to [AAP10a, AAP11, AAP10b, AAPvR10].

• Chapter 3 is based on the article [APvRS10a] by Catuscia Palamidessi,
Peter van Rossum, Geoffrey Smith and myself. The article was
presented in TACAS 2010.

• Chapter 4 is based on

- The article [APvRS10b] by Catuscia Palamidessi, Peter van Rossum,
Ana Sokolova and myself. This article was presented in QEST
2010.

- The journal article [APvRS11] by the same authors.

• Chapter 5 is based on the article [ADvR08] by Pedro D'Argenio, Peter
van Rossum, and myself. The article was presented in HVC 2008.

• Chapter 6 is based on

- The article [AAP10b] by Mario S. Alvim, Catuscia Palamidessi,
and myself. This work was presented in LICS 2010 as part of an
invited talk by Catuscia Palamidessi.

- The article [AAP10a] by Mario S. Alvim, Catuscia Palamidessi,
and myself. This work was presented in CONCUR 2010.

- The journal article [AAP11] by the same authors as the previous
works.

- The article [AAPvR10] by Mario S. Alvim, Catuscia Palamidessi,
Peter van Rossum, and myself. This work was presented in
IFIP-TCS 2010.

The chapters remain close to their published versions, thus there is 
inevitably some overlapping between them (in particular in their introduc- 
tions where basic notions are explained). 

A short note about authorship: I am the first author in all the articles
and journal works included in this thesis with the exception of the ones
presented in Chapter 6.

Conditional Probabilities over Probabilistic and Nondeterministic Systems

In this chapter we introduce cpCTL, a logic which extends the
probabilistic temporal logic pCTL with conditional probabilities,
allowing us to express statements of the form "the probability of $\phi$
given $\psi$ is at most $a$". We interpret cpCTL over Markov Chains
and Markov Decision Processes. While model checking cpCTL 
over Markov Chains can be done with existing techniques, those 
techniques do not carry over to Markov Decision Processes. We 
study the class of schedulers that suffice to find the maximum 
and minimum conditional probabilities, show that the problem 
is decidable for Markov Decision Processes and propose a model 
checking algorithm. Finally, we present the notion of
counterexamples for cpCTL model checking and provide a method for
counterexample generation.

2.1 Introduction

Conditional probabilities are a fundamental concept in probability theory. 
In system validation these appear for instance in anonymity, risk assess- 
ment, and diagnosability. Typical examples here are: the probability that 
a certain message was sent by Alice, given that an intruder observes a cer- 
tain traffic pattern; the probability that the dykes break, given that it rains 
heavily; the probability that component A has failed, given error message 
E. 

In this chapter we introduce cpCTL (conditional probabilistic CTL), a
logic which strictly extends the probabilistic temporal logic pCTL [HJ89]
with new probabilistic operators of the form $P_{\leq a}[\phi \mid \psi]$. Such a formula means
that the probability of $\phi$ given $\psi$ is at most $a$. We interpret cpCTL formulas
over Markov Chains (MCs) and Markov Decision Processes (MDPs). Model
checking cpCTL over MCs can be done with model checking techniques for
pCTL*, using the equality $P[\phi \mid \psi] = P[\phi \wedge \psi] / P[\psi]$.

In the case of MDPs, cpCTL model checking is significantly more
complex. Writing $P_\eta[\phi \mid \psi]$ for the probability $P[\phi \mid \psi]$ under scheduler $\eta$, model
checking $P_{\leq a}[\phi \mid \psi]$ reduces to computing
$P^+[\phi \mid \psi] = \max_\eta P_\eta[\phi \mid \psi] = \max_\eta P_\eta[\phi \wedge \psi] / P_\eta[\psi]$.
Thus, we have to maximize a non-linear function. (Note
that in general $P^+[\phi \mid \psi] \neq P^+[\phi \wedge \psi] / P^+[\psi]$.) Therefore, we cannot reuse
the efficient techniques for pCTL model checking, since they heavily rely
on linear optimization techniques [BdA95].
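
The gap between the maximal conditional probability and the quotient of the two maximal probabilities can be seen on a very small instance. The following Python sketch uses a hypothetical one-step MDP that does not come from the thesis; the distribution names `pi1` and `pi2` and all probabilities are assumptions chosen for illustration.

```python
from fractions import Fraction as F

# Hypothetical one-step MDP: in the initial state a scheduler picks one of
# two distributions over absorbing states; each absorbing state is identified
# here with its set of atomic propositions.
choices = {
    "pi1": {frozenset({"A", "B"}): F(1, 10), frozenset(): F(9, 10)},
    "pi2": {frozenset({"A", "B"}): F(3, 10), frozenset({"B"}): F(6, 10),
            frozenset(): F(1, 10)},
}

def reach(dist, props):
    """Probability of ending in a state whose label contains all of props."""
    return sum(p for label, p in dist.items() if props <= label)

# phi = "eventually A", psi = "eventually B"; the initial state has no label.
joint = {name: reach(d, {"A", "B"}) for name, d in choices.items()}
given = {name: reach(d, {"B"}) for name, d in choices.items()}

# Maximum of the quotient versus quotient of the maxima.
max_conditional = max(joint[n] / given[n] for n in choices)
ratio_of_maxima = max(joint.values()) / max(given.values())
```

Here the first choice yields a conditional probability of 1, while dividing the two separately maximized probabilities gives 1/3, so the two quantities differ in this example.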

In particular we show that, differently from what happens in pCTL 
[BdA95], history independent schedulers are not sufficient for optimizing 
conditional reachability properties. This is because in cpCTL the opti- 
mizing schedulers are not determined by the local structure of the system. 
That is, the choices made by the scheduler in one branch may influence the 
optimal choices in other branches. We introduce the class of semi history- 
independent schedulers and show that these suffice to attain the optimal 
conditional probability. Moreover, deterministic schedulers still suffice to 
attain the optimal conditional probability. This is surprising since many
non-linear optimization problems attain their optimal value in the interior
of a convex polytope, which corresponds to randomized schedulers in our
setting.

Based on these properties, we present an (exponential) algorithm for 
checking whether a given system satisfies a formula in the logic. Further- 
more, we define the notion of counterexamples for cpCTL model checking 
and provide a method for counterexample generation. 

To the best of our knowledge, our proposal is the first temporal logic 
dealing with conditional probabilities. 

Applications 

Complex Systems. One application of the techniques presented in this 
chapter is in the area of complex system behavior. We can model the 
probability distribution of natural events as probabilistic choices, and the 
operator choices as non-deterministic choices. The computation of max- 
imum and minimum conditional probabilities can then help to optimize 
run-time behavior. For instance, suppose that the desired behavior of the
system is expressed as a pCTL formula $\phi$ and that during run-time we are
making an observation about the system, expressed as a pCTL formula $\psi$.
The techniques developed in this chapter allow us to compute the
maximum probability of $\phi$ given $\psi$ and to identify the actions (non-deterministic
choices) that have to be taken to achieve this probability.

Anonymizing Protocols. Another application is in the area of anonymiz- 
ing protocols. The purpose of these protocols is to hide the identity of the 
user performing a certain action. Such a user is usually called the culprit. 
Examples of these protocols are Onion Routing [CL05], Dining
Cryptographers [Cha88], Crowds [RR98] and voting protocols [FOO92] (just to
mention a few). Strong anonymity is commonly formulated [Cha88, BP05] in
terms of conditional probability: A protocol is considered strongly
anonymous if no information about the culprit's identity can be inferred from the
behavior of the system. Formally, this is expressed by saying that the culprit's
identity and the observations, seen as random variables, are independent
of each other. That is to say, for all users $u$ and all observations $o$ of the
adversary:

\[ P[\text{culprit} = u \mid \text{observation} = o] = P[\text{culprit} = u]. \]



In a concurrent setting, it is customary to give the
adversary full control over the network [DY83] and model its capabilities as
nondeterministic choices in the system, while the user behavior and the
random choices in the protocol are modeled as probabilistic choices. Since
anonymity should be guaranteed for all possible attacks of the adversary,
the above equality should hold for all schedulers. That is: the system is
strongly anonymous if for all schedulers $\eta$, all users $u$ and all adversarial
observations $o$:

\[ P_\eta[\text{culprit} = u \mid \text{observation} = o] = P_\eta[\text{culprit} = u]. \]

Since the techniques in this chapter allow us to compute the maximal and
minimal conditional probabilities over all schedulers, we can use them to
prove strong anonymity in the presence of nondeterminism.

Similarly, probable innocence means that a user is not more likely to
be the culprit than to be innocent (where "innocent" means "not the culprit"). In
cpCTL this can be expressed as $P_{\leq 0.5}[\text{culprit} = u \mid \text{observations} = o]$.

Organization of the chapter. In Section 2.2 we present the necessary
background on MDPs. In Section 2.3 we introduce conditional probabilities
over MDPs and in Section 2.4 we introduce cpCTL. Section 2.5 introduces
the class of semi history-independent schedulers and Section 2.6 explains
how to compute the maximum and minimum conditional probabilities.
Finally, in Section 2.7, we investigate the notion of counterexamples.

2.2 Markov Decision Processes

Markov Decision Processes constitute a formalism that combines
nondeterministic and probabilistic choices. They are a dominant model in corporate
finance, supply chain optimization, and system verification and
optimization. While there are many slightly different variants of this formalism (e.g.,
action-labeled MDPs [Bel57, FV97], probabilistic automata [SL95, SdV04]),
we work with the state-labeled MDPs from [BdA95].

The set of all discrete probability distributions on a set $S$ is denoted by
$\mathrm{Distr}(S)$. The Dirac distribution on an element $s \in S$ is written as $1_s$. We
also fix a set $\mathcal{P}$ of propositions.

Definition 2.2.1. A Markov Decision Process (MDP) is a four-tuple
$\Pi = (S, s_0, \tau, L)$ where: $S$ is the finite state space of the system, $s_0 \in S$ is the
initial state, $L : S \to \wp(\mathcal{P})$ is a labeling function that associates to each
state $s \in S$ a subset of propositions, and $\tau : S \to \wp(\mathrm{Distr}(S))$ is a function
that associates to each $s \in S$ a non-empty and finite subset of successor
distributions.



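In case $|\tau(s)| = 1$ for all states $s$ we say that $\Pi$ is a Markov Chain.
We define the successor relation $\varrho \subseteq S \times S$
by $\varrho = \{(s, t) \mid \exists \pi \in \tau(s) \,.\, \pi(t) > 0\}$
and for each state $s \in S$ we define the
sets $\mathrm{Paths}(s) = \{s_0 s_1 s_2 \ldots \in S^{\omega} \mid s_0 = s \wedge \forall n \in \mathbb{N} \,.\, \varrho(s_n, s_{n+1})\}$
and $\mathrm{Paths}^*(s) = \{s_0 s_1 \ldots s_n \in S^* \mid s_0 = s \wedge \forall\, 0 \leq i < n \,.\, \varrho(s_i, s_{i+1})\}$
of paths and finite paths respectively beginning at $s$. Sometimes we will
use $\mathrm{Paths}(\Pi)$ to denote $\mathrm{Paths}(s_0)$, i.e. the set
of paths of $\Pi$. For $\omega \in \mathrm{Paths}(s)$, we write
the $n$-th state of $\omega$ as $\omega_n$. In addition, we
write $\sigma_1 \sqsubseteq \sigma_2$ if $\sigma_2$ is an extension of $\sigma_1$, i.e.
$\sigma_2 = \sigma_1 \sigma'$ for some $\sigma'$. We define the basic cylinder of a finite path $\sigma$ as the
set of (infinite) paths that extend it, i.e. $\langle \sigma \rangle = \{\omega \in \mathrm{Paths}(s) \mid \sigma \sqsubseteq \omega\}$. For
a set of paths $R$ we write $\langle R \rangle$ for its set of cylinders, i.e. $\langle R \rangle = \bigcup_{\sigma \in R} \langle \sigma \rangle$.
As usual, we let $\mathcal{B}_s \subseteq \wp(\mathrm{Paths}(s))$ be the Borel $\sigma$-algebra on the basic
cylinders.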
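
To make the definitions concrete, a state-labeled MDP and its successor relation can be represented as plain data. The following Python sketch is only one possible encoding; the class `MDP`, the helper names and the three-state example are assumptions for illustration and are not the MDP of Figure 2.1.

```python
from dataclasses import dataclass

@dataclass
class MDP:
    states: frozenset
    initial: str
    tau: dict      # state -> list of distributions (dict: state -> probability)
    labels: dict   # state -> frozenset of propositions

def successors(mdp, s):
    """States t related to s, i.e. pi(t) > 0 for some pi in tau(s)."""
    return {t for pi in mdp.tau[s] for t, p in pi.items() if p > 0}

def finite_paths(mdp, s, n):
    """All finite paths with exactly n transitions that start in s."""
    paths = [(s,)]
    for _ in range(n):
        paths = [path + (t,)
                 for path in paths
                 for t in sorted(successors(mdp, path[-1]))]
    return paths

def is_markov_chain(mdp):
    """A Markov Chain offers exactly one distribution in every state."""
    return all(len(mdp.tau[s]) == 1 for s in mdp.states)

# Hypothetical example: one nondeterministic choice in s0, two absorbing states.
example = MDP(
    states=frozenset({"s0", "s1", "s2"}),
    initial="s0",
    tau={
        "s0": [{"s1": 1.0}, {"s1": 0.5, "s2": 0.5}],
        "s1": [{"s1": 1.0}],
        "s2": [{"s2": 1.0}],
    },
    labels={"s0": frozenset(), "s1": frozenset({"A"}), "s2": frozenset({"B"})},
)
```

Absorbing states appear here as states whose only distribution is the Dirac distribution on themselves.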



Figure 2.1: MDP



Example 2.2.2. Figure 2.1 shows a MDP. States with double lines
represent absorbing states (i.e., states $s$ with $\tau(s) = \{1_s\}$) and $\alpha$ is any constant
in the interval $[0, 1]$. This MDP features a single nondeterministic decision,
to be made in state $s_2$.

Schedulers (also called strategies, adversaries, or policies) resolve the
nondeterministic choices in a MDP [PZ93, Var85, BdA95].

Definition 2.2.3. Let $\Pi = (S, s_0, \tau, L)$ be a MDP and $s \in S$. An
$s$-scheduler $\eta$ for $\Pi$ is a function from $\mathrm{Paths}^*(s)$ to $\mathrm{Distr}(\wp(\mathrm{Distr}(S)))$ such
that for all $\sigma \in \mathrm{Paths}^*(s)$ we have $\eta(\sigma) \in \mathrm{Distr}(\tau(\mathrm{last}(\sigma)))$. We denote the
set of all $s$-schedulers on $\Pi$ by $\mathrm{Sch}_s(\Pi)$. When $s = s_0$ we omit it.

Note that our schedulers are randomized, i.e., in a finite path $\sigma$ a scheduler
chooses an element of $\tau(\mathrm{last}(\sigma))$ probabilistically. Under a scheduler $\eta$,
the probability that the next state reached after the path $\sigma$ is $t$ equals
$\sum_{\pi \in \tau(\mathrm{last}(\sigma))} \eta(\sigma)(\pi) \cdot \pi(t)$. In this way, a scheduler induces a probability
measure on $\mathcal{B}_s$ defined as follows:

Definition 2.2.4. Let $\Pi$ be a MDP, $s \in S$, and $\eta$ an $s$-scheduler on $\Pi$.
The probability measure $P_{s,\eta}$ is the unique measure on $\mathcal{B}_s$ such that for all
$s_0 s_1 \ldots s_n \in \mathrm{Paths}^*(s)$

\[ P_{s,\eta}(\langle s_0 s_1 \ldots s_n \rangle) = \prod_{i=0}^{n-1} \sum_{\pi \in \tau(s_i)} \eta(s_0 s_1 \ldots s_i)(\pi) \cdot \pi(s_{i+1}). \]
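
The product formula of Definition 2.2.4 translates almost literally into code. The sketch below is a minimal illustration under assumed names: `tau` maps each state to a list of distributions, and a scheduler is modeled as a function returning one weight per distribution, in list order. The small MDP and the scheduler `eta` are hypothetical.

```python
from fractions import Fraction as F

def cylinder_probability(tau, scheduler, path):
    """Probability of the basic cylinder of a finite path s_0 ... s_n."""
    prob = F(1)
    for i in range(len(path) - 1):
        prefix, nxt = path[: i + 1], path[i + 1]
        weights = scheduler(prefix)  # eta(s_0 ... s_i), one weight per pi
        # Inner sum over pi in tau(s_i) of eta(prefix)(pi) * pi(s_{i+1}).
        step = sum(w * pi.get(nxt, F(0))
                   for w, pi in zip(weights, tau[path[i]]))
        prob *= step
    return prob

# Hypothetical MDP: s0 offers two distributions, s1 and s2 are absorbing.
tau = {
    "s0": [{"s1": F(1)}, {"s1": F(1, 2), "s2": F(1, 2)}],
    "s1": [{"s1": F(1)}],
    "s2": [{"s2": F(1)}],
}

def eta(prefix):
    """Randomized scheduler: weights over tau(last(prefix)), in list order."""
    if prefix[-1] == "s0":
        return [F(1, 3), F(2, 3)]
    return [F(1)]

p_s1 = cylinder_probability(tau, eta, ("s0", "s1"))
p_s2 = cylinder_probability(tau, eta, ("s0", "s2"))
```

The two one-step cylinders partition the set of paths from `s0`, so their probabilities add up to one.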

Often we will write $P_\eta(\Delta)$ instead of $P_{s,\eta}(\Delta)$ when $s$ is the initial state
and $\Delta \in \mathcal{B}_s$. We now recall the notions of deterministic and history
independent schedulers.

Definition 2.2.5. Let $\Pi$ be a MDP, $s \in S$, and $\eta$ an $s$-scheduler for $\Pi$. We
say that $\eta$ is deterministic if $\eta(\sigma)(\pi)$ is either 0 or 1 for all $\pi \in \tau(\mathrm{last}(\sigma))$ and
all $\sigma \in \mathrm{Paths}^*(s)$. We say that a scheduler is history independent (HI) if for
all finite paths $\sigma_1, \sigma_2$ of $\Pi$ with $\mathrm{last}(\sigma_1) = \mathrm{last}(\sigma_2)$ we have $\eta(\sigma_1) = \eta(\sigma_2)$.

Definition 2.2.6. Let $\Pi$ be a MDP, $s \in S$, and $\Delta \in \mathcal{B}_s$. Then the maximal
and minimal probabilities of $\Delta$, $P_s^+(\Delta)$ and $P_s^-(\Delta)$, are defined as

\[ P_s^+(\Delta) = \sup_{\eta \in \mathrm{Sch}_s(\Pi)} P_{s,\eta}(\Delta) \quad \text{and} \quad P_s^-(\Delta) = \inf_{\eta \in \mathrm{Sch}_s(\Pi)} P_{s,\eta}(\Delta). \]

A scheduler that attains $P_s^+(\Delta)$ or $P_s^-(\Delta)$ is called an optimizing scheduler.

We define the notion of (finite) convex combination of schedulers.

Definition 2.2.7. Let $\Pi$ be a MDP and let $s \in S$. An $s$-scheduler $\eta$ is a
convex combination of the $s$-schedulers $\eta_1, \ldots, \eta_n$ if there are $\alpha_1, \ldots, \alpha_n \in [0, 1]$
with $\alpha_1 + \cdots + \alpha_n = 1$ such that for all $\Delta \in \mathcal{B}_s$,
$P_{s,\eta}(\Delta) = \alpha_1 P_{s,\eta_1}(\Delta) + \cdots + \alpha_n P_{s,\eta_n}(\Delta)$.

Note that taking the convex combination $\eta$ of $\eta_1$ and $\eta_2$ as functions, i.e.,
$\eta(\sigma)(\pi) = \alpha \eta_1(\sigma)(\pi) + (1 - \alpha) \eta_2(\sigma)(\pi)$, does not imply that $\eta$ is a convex
combination of $\eta_1$ and $\eta_2$ in the sense above.

2.3 Conditional Probabilities over MDPs 

The conditional probability $P(A \mid B)$ is the probability of an event $A$
given the occurrence of another event $B$. Recall that given a probability
space $(\Omega, F, P)$ and two events $A, B \in F$ with $P(B) > 0$, $P(A \mid B)$ is
defined as $P(A \cap B) / P(B)$. If $P(B) = 0$, then $P(A \mid B)$ is undefined.
In particular, given a MDP $\Pi$, a scheduler $\eta$, and a state $s$, consider the
probability space $(\mathrm{Paths}(s), \mathcal{B}_s, P_{s,\eta})$. For two sets of paths $\Delta_1, \Delta_2 \in \mathcal{B}_s$
with $P_{s,\eta}(\Delta_2) > 0$, the conditional probability of $\Delta_1$ given $\Delta_2$ is
$P_{s,\eta}(\Delta_1 \mid \Delta_2) = P_{s,\eta}(\Delta_1 \cap \Delta_2) / P_{s,\eta}(\Delta_2)$. If $P_{s,\eta}(\Delta_2) = 0$, then
$P_{s,\eta}(\Delta_1 \mid \Delta_2)$ is undefined. We define the maximum and minimum conditional probabilities
for all $\Delta_2 \in \mathcal{B}_s$ as follows:

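Definition 2.3.1. Let $\Pi$ be a MDP. The maximal and minimal
conditional probabilities $P_s^+(\Delta_1 \mid \Delta_2)$, $P_s^-(\Delta_1 \mid \Delta_2)$ of sets of paths $\Delta_1, \Delta_2 \in \mathcal{B}_s$ are
defined by

\[ P_s^+(\Delta_1 \mid \Delta_2) = \begin{cases} \sup_{\eta \in \mathrm{Sch}^{>0}_{\Delta_2}} P_{s,\eta}(\Delta_1 \mid \Delta_2) & \text{if } \mathrm{Sch}^{>0}_{\Delta_2} \neq \emptyset, \\ 0 & \text{otherwise,} \end{cases} \]

\[ P_s^-(\Delta_1 \mid \Delta_2) = \begin{cases} \inf_{\eta \in \mathrm{Sch}^{>0}_{\Delta_2}} P_{s,\eta}(\Delta_1 \mid \Delta_2) & \text{if } \mathrm{Sch}^{>0}_{\Delta_2} \neq \emptyset, \\ 1 & \text{otherwise,} \end{cases} \]

where $\mathrm{Sch}^{>0}_{\Delta_2} = \{\eta \in \mathrm{Sch}_s(\Pi) \mid P_{s,\eta}(\Delta_2) > 0\}$.

The following lemma generalizes Lemma 6 of [BdA95] to conditional
probabilities.

Lemma 2.3.2. Given $\Delta_1, \Delta_2 \in \mathcal{B}_s$, their maximal and minimal conditional
probabilities are related by: $P_s^+(\Delta_1 \mid \Delta_2) = 1 - P_s^-(\mathrm{Paths}(s) - \Delta_1 \mid \Delta_2)$.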
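
The identity of Lemma 2.3.2 can be checked with a short calculation. The following is only a sketch of the argument, written under the assumption that at least one scheduler gives $\Delta_2$ positive probability; $\mathrm{Sch}^{>0}_{\Delta_2}$ denotes the set of such schedulers, and $\eta$ in the first line ranges over that set.

```latex
\begin{align*}
P_{s,\eta}(\mathrm{Paths}(s) - \Delta_1 \mid \Delta_2)
  &= \frac{P_{s,\eta}(\Delta_2) - P_{s,\eta}(\Delta_1 \cap \Delta_2)}{P_{s,\eta}(\Delta_2)}
   = 1 - P_{s,\eta}(\Delta_1 \mid \Delta_2), \\
P_s^-(\mathrm{Paths}(s) - \Delta_1 \mid \Delta_2)
  &= \inf_{\eta \in \mathrm{Sch}^{>0}_{\Delta_2}} \bigl(1 - P_{s,\eta}(\Delta_1 \mid \Delta_2)\bigr)
   = 1 - \sup_{\eta \in \mathrm{Sch}^{>0}_{\Delta_2}} P_{s,\eta}(\Delta_1 \mid \Delta_2)
   = 1 - P_s^+(\Delta_1 \mid \Delta_2).
\end{align*}
```

If no scheduler gives $\Delta_2$ positive probability, the identity still holds provided the maximal conditional probability is set to 0 and the minimal one to 1 by convention.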

2.4 Conditional Probabilistic Temporal Logic 

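The logic cpCTL extends pCTL with formulas of the form $P_{\bowtie a}[\phi \mid \psi]$ where
$\bowtie \in \{<, \leq, >, \geq\}$. Intuitively, $P_{\leq a}[\phi \mid \psi]$ holds if the probability of $\phi$ given
$\psi$ is at most $a$. Similarly for the other comparison operators.

Syntax: The cpCTL logic is defined as a set of state and path formulas,
i.e., cpCTL $= \mathrm{Stat} \cup \mathrm{Path}$, where Stat and Path are defined inductively:

\[ \begin{array}{rcl}
 & & \mathcal{P} \subseteq \mathrm{Stat}, \\
\phi, \psi \in \mathrm{Stat} & \Rightarrow & \phi \wedge \psi,\ \neg\phi \in \mathrm{Stat}, \\
\phi, \psi \in \mathrm{Path} & \Rightarrow & P_{\bowtie a}[\phi],\ P_{\bowtie a}[\phi \mid \psi] \in \mathrm{Stat}, \\
\phi, \psi \in \mathrm{Stat} & \Rightarrow & \phi \,\mathcal{U}\, \psi,\ \Diamond\phi,\ \Box\phi \in \mathrm{Path}.
\end{array} \]

Here $\bowtie \in \{<, \leq, >, \geq\}$ and $a \in [0, 1]$.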
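
The inductive syntax above maps naturally onto a small family of immutable data types. The Python sketch below is one possible encoding and is not part of the thesis; all class and function names are assumptions, and the split between state and path formulas is only documented in comments rather than enforced by the types.

```python
from dataclasses import dataclass, fields, is_dataclass

COMPARISONS = {"<", "<=", ">", ">="}

def _check_bound(op, bound):
    if op not in COMPARISONS or not 0 <= bound <= 1:
        raise ValueError("expected a comparison operator and a bound in [0, 1]")

# State formulas
@dataclass(frozen=True)
class Prop:
    name: str

@dataclass(frozen=True)
class Not:
    arg: object

@dataclass(frozen=True)
class And:
    left: object
    right: object

@dataclass(frozen=True)
class Prob:            # P_{op bound}[path]
    op: str
    bound: float
    path: object
    def __post_init__(self):
        _check_bound(self.op, self.bound)

@dataclass(frozen=True)
class CondProb:        # P_{op bound}[path | given]
    op: str
    bound: float
    path: object
    given: object
    def __post_init__(self):
        _check_bound(self.op, self.bound)

# Path formulas
@dataclass(frozen=True)
class Until:
    left: object
    right: object

@dataclass(frozen=True)
class Eventually:
    arg: object

@dataclass(frozen=True)
class Always:
    arg: object

def uses_conditional(formula):
    """True if a conditional probability operator occurs in the formula."""
    if isinstance(formula, CondProb):
        return True
    if not is_dataclass(formula):
        return False
    return any(uses_conditional(getattr(formula, f.name))
               for f in fields(formula))

# "The probability of eventually A given eventually B is at most 0.5".
example = CondProb("<=", 0.5, Eventually(Prop("A")), Eventually(Prop("B")))
plain = Prob("<=", 0.5, Eventually(Prop("A")))
```

A formula for which `uses_conditional` returns `False` is built only from pCTL operators in this encoding.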

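Semantics: The satisfiability of state formulas ($s \models \phi$ for a state $s$) and
path formulas ($\omega \models \psi$ for a path $\omega$) is defined as an extension of the
satisfiability for pCTL. Hence, the satisfiability of the logical, temporal, and
pCTL operators is defined in the usual way. For the conditional
probabilistic operators we define

\[ s \models P_{\leq a}[\phi \mid \psi] \iff P_s^+(\{\omega \in \mathrm{Paths}(s) \mid \omega \models \phi\} \mid \{\omega \in \mathrm{Paths}(s) \mid \omega \models \psi\}) \leq a, \]
\[ s \models P_{\geq a}[\phi \mid \psi] \iff P_s^-(\{\omega \in \mathrm{Paths}(s) \mid \omega \models \phi\} \mid \{\omega \in \mathrm{Paths}(s) \mid \omega \models \psi\}) \geq a, \]

and similarly for $s \models P_{<a}[\phi \mid \psi]$ and $s \models P_{>a}[\phi \mid \psi]$. We say that a model $M$
satisfies $\phi$, denoted by $M \models \phi$, if $s_0 \models \phi$.

In the following we fix some notation that we will use in the rest of the
chapter:

\[ P_s^+[\phi] \triangleq P_s^+(\{\omega \in \mathrm{Paths}(s) \mid \omega \models \phi\}), \]
\[ P_s^+[\phi \mid \psi] \triangleq P_s^+(\{\omega \in \mathrm{Paths}(s) \mid \omega \models \phi\} \mid \{\omega \in \mathrm{Paths}(s) \mid \omega \models \psi\}), \]
\[ P_{s,\eta}[\phi \mid \psi] \triangleq P_{s,\eta}(\{\omega \in \mathrm{Paths}(s) \mid \omega \models \phi\} \mid \{\omega \in \mathrm{Paths}(s) \mid \omega \models \psi\}). \]

$P_s^-[\phi \mid \psi]$ and $P_s^-[\phi]$ are defined analogously.

Observation 2.4.1. As usual, for checking if $s \models P_{\bowtie a}[\phi \mid \psi]$, we only need
to consider the cases where $\phi = \phi_1 \,\mathcal{U}\, \phi_2$ and where $\psi$ is either $\psi_1 \,\mathcal{U}\, \psi_2$ or
$\Box\psi_1$. This follows from $\Diamond\phi \leftrightarrow \mathit{true} \,\mathcal{U}\, \phi$, $\Box\phi \leftrightarrow \neg\Diamond\neg\phi$ and the relations

\[ P_s^+[\neg\phi \mid \psi] = 1 - P_s^-[\phi \mid \psi] \quad \text{and} \quad P_s^-[\neg\phi \mid \psi] = 1 - P_s^+[\phi \mid \psi] \]

derived from Lemma 2.3.2. Since there is no way to relate $P^+[\phi \mid \psi]$ and
$P^+[\phi \mid \neg\psi]$, we have to provide algorithms to compute both $P^+[\phi \mid \psi_1 \,\mathcal{U}\, \psi_2]$
and $P^+[\phi \mid \Box\psi_1]$. The same remark holds for the minimal conditional
probabilities $P^-[\phi \mid \psi_1 \,\mathcal{U}\, \psi_2]$ and $P^-[\phi \mid \Box\psi_1]$. In this chapter we will only focus
on the former problem, i.e., computing maximum conditional probabilities;
the minimal case follows using similar techniques.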

2.4.1 Expressiveness 

We now show that cpCTL is strictly more expressive than pCTL. The
notion of expressiveness of a temporal logic is based on the notion of formula
equivalence. Two temporal logic formulas $\phi$ and $\psi$ are equivalent with
respect to a set $\mathcal{D}$ of models (denoted by $\phi \equiv_{\mathcal{D}} \psi$) if for any model $m \in \mathcal{D}$
we have $m \models \phi$ if and only if $m \models \psi$. A temporal logic $\mathcal{L}$ is said to be at
least as expressive as a temporal logic $\mathcal{L}'$, over a set of models $\mathcal{D}$, if for any
formula $\phi \in \mathcal{L}'$ there is a formula $\psi \in \mathcal{L}$ that is equivalent to $\phi$ over $\mathcal{D}$.
Two temporal logics are equally expressive when each of them is at least as
expressive as the other. Formally:

Definition 2.4.1. Two temporal logics $\mathcal{L}$ and $\mathcal{L}'$ are equally expressive
with respect to $\mathcal{D}$ if

\[ \forall \phi \in \mathcal{L} \,.\, (\exists \psi \in \mathcal{L}' \,.\, \phi \equiv_{\mathcal{D}} \psi) \;\wedge\; \forall \psi \in \mathcal{L}' \,.\, (\exists \phi \in \mathcal{L} \,.\, \phi \equiv_{\mathcal{D}} \psi). \]






Theorem 2.4.2. cpCTL is more expressive than pCTL with respect to 
MCs and MDPs. 

Proof. Obviously cpCTL is at least as expressive as pCTL, hence we only 
need to show that the reverse does not hold. The result is rather intuitive 
since the semantics of the conditional operator for cpCTL logic is provided 
by a non-linear equation whereas there is no pCTL formula with non-linear 
semantics. 

The following is a formal proof. We plan to show that there is no
pCTL formula $\psi$ equivalent to $\phi = P_{\leq 0.5}[\Diamond A \mid \Diamond B]$, with $A$ and $B$ atomic
propositions. The proof is by cases on the structure of the pCTL formula
$\psi$. The most interesting case is when $\psi$ is of the form $P_{\leq b}[\,\cdot\,]$, so we will
only prove this case. In addition we restrict our attention to $b$'s such that
$0 < b < 1$ (the cases $b = 0$ and $b = 1$ are easy). In Figure 2.2 we depict the
Markov Chains involved in the proof. We use $\neg\psi_1$ to mark the states with
an assignment of truth values (to propositional variables) falsifying $\psi_1$.

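Case $\psi = P_{\leq b}[\Diamond\psi_1]$:

If $\psi_1$ is true or false the proof is obvious, so we assume otherwise. We
first note that we either have $\neg\psi_1 \Rightarrow \neg(B \wedge \neg A)$ or $\neg\psi_1 \Rightarrow (B \wedge \neg A)$.
In the former case, it is easy to see (using $\neg B \Rightarrow \psi_1$) that we have
$m_2 \models \phi$ and $m_2 \not\models \psi$. In the second case we have $m_1 \not\models \phi$ and
$m_1 \models \psi$.

Case $\psi = P_{\leq b}[\psi_1 \,\mathcal{U}\, \psi_2]$:

We assume $\psi_1 \neq \mathit{true}$, otherwise we fall into the previous case. We
can easily see that we have $m_3 \models \psi$ but $m_3 \not\models \phi$.

Case $\psi = P_{\leq b}[\Box\psi_1]$:

The case when $\psi_1 = \mathit{true}$ is easy, so we assume $\psi_1 \neq \mathit{true}$. We can
easily see that we have $m_3 \models \psi$ but $m_3 \not\models \phi$. □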

Note that, since MCs are a special case of MDPs, the proof also holds 
for the latter class. 

We note that, in spite of the fact that a cpCTL formula of the form
$P_{\leq a}[\phi \mid \psi]$ cannot be expressed as a pCTL formula, if dealing with fully
probabilistic systems (i.e. systems without nondeterministic choices) it is
still possible to verify such conditional probability formulas as the quotient
of two pCTL* formulas: $P[\phi \mid \psi] = P[\phi \wedge \psi] / P[\psi]$. However, this observation does
not carry over to systems where probabilistic choices are combined with
nondeterministic ones (as is the case of Markov Decision Processes). This
is due to the fact that, in general, it is not the case that
$P^+[\phi \mid \psi] = P^+[\phi \wedge \psi] / P^+[\psi]$.

Figure 2.2: Markov Chains $m_1$, $m_2$, and $m_3$ respectively.



2.5 Semi History-Independent and Deterministic 
Schedulers 

Recall that there exist optimizing (i.e. maximizing and minimizing) sched- 
ulers on pCTL that are HI and deterministic [BdA95]. We show that, 
for cpCTL, deterministic schedulers still suffice to reach the optimal condi- 
tional probabilities. Because we now have to solve a non-linear optimization 
problem, the proof differs from the pCTL case in an essential way. We also 
show that HI schedulers do not suffice to attain optimal conditional proba- 
bility and introduce the family of semi history-independent schedulers that 
do attain it. 

2.5.1 Semi History-Independent Schedulers 

The following example shows that maximizing schedulers are not necessarily
HI.

Example 2.5.1. Let $\Pi$ be the MDP of Figure 2.3
and consider the conditional probability $P_{s_0,\eta}[\Diamond B \mid \Diamond P]$. There
are only three deterministic history independent schedulers,
choosing $\pi_1$, $\pi_2$, or $\pi_3$ in $s_0$. For the first one,
the conditional probability is undefined and for the second
and third it is 0. The scheduler $\eta$ that maximizes
$P_{s_0,\eta}[\Diamond B \mid \Diamond P]$ satisfies $\eta(s_0) = \pi_3$, $\eta(s_0 s_3) = \pi_5$, and
$\eta(s_0 s_3 s_0) = \pi_1$. Since $\eta$ chooses on $s_0$ first $\pi_3$ and later
$\pi_1$, $\eta$ is not history independent.




Figure 2.3: MDP 



Fortunately, as we show in Theorem 2.5.3, there exist nearly HI schedulers
that attain the optimal conditional probability. We say that such schedulers
are nearly HI because they always take the same decision before the system
reaches a certain condition $\varphi$ and also always take the same decision after
$\varphi$. This family of schedulers is called $\varphi$-semi history independent ($\varphi$-sHI for
short) and the condition $\varphi$ is called the stopping condition. For a pCTL path
formula $\phi$ the stopping condition is a boolean proposition either
validating or contradicting $\phi$. So, the (validating) stopping condition of $\Diamond\phi$ is $\phi$
whereas the (contradicting) stopping condition of $\Box\phi$ is $\neg\phi$. Formally:



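\[ \mathrm{StopC}(\phi) = \begin{cases} \neg\phi_1 \vee \phi_2 & \text{if } \phi = \phi_1 \,\mathcal{U}\, \phi_2, \\ \neg\phi_1 & \text{if } \phi = \Box\phi_1. \end{cases} \]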


Similarly, for a cpCTL formula $P_{\bowtie a}[\phi \mid \psi]$, the stopping condition is a
condition either validating or contradicting any of its pCTL formulas ($\phi$,
$\psi$), i.e., $\mathrm{StopC}(P_{\bowtie a}[\phi \mid \psi]) = \mathrm{StopC}(\phi) \vee \mathrm{StopC}(\psi)$.

We now proceed with the formalization of semi history independent 
schedulers. 



Definition 2.5.2 (Semi History-Independent Schedulers). Let $\Pi$ be a MDP,
$\eta$ a scheduler for $\Pi$, and $\phi \vee \psi \in \mathrm{Stat}$. We say that $\eta$ is a $(\phi \vee \psi)$ semi
history-independent scheduler ($(\phi \vee \psi)$-sHI scheduler for short) if for all
$\sigma_1, \sigma_2 \in \mathrm{Paths}^*(s)$ such that $\mathrm{last}(\sigma_1) = \mathrm{last}(\sigma_2)$ we have

$\sigma_1, \sigma_2 \not\models \Diamond(\phi \vee \psi) \Rightarrow \eta(\sigma_1) = \eta(\sigma_2)$, and {HI before stopping condition}
$\sigma_1, \sigma_2 \models \Diamond\phi \Rightarrow \eta(\sigma_1) = \eta(\sigma_2)$, and {HI after stopping condition}
$\sigma_1, \sigma_2 \models \Diamond\psi \Rightarrow \eta(\sigma_1) = \eta(\sigma_2)$. {HI after stopping condition}

We denote the set of all $\varphi$-sHI schedulers of $\Pi$ by $\mathrm{Sch}^{\varphi}(\Pi)$.
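
The three conditions of the definition can be checked mechanically on a finite set of finite paths. The following Python sketch is an illustration under assumptions: paths are tuples of state names, the scheduler is deterministic and given as a function from paths to decisions, and $\phi$ and $\psi$ are represented by the sets of states satisfying them. The example data is hypothetical and only mimics the shape of Example 2.5.1.

```python
def is_semi_history_independent(paths, scheduler, sat_phi, sat_psi):
    """Check the sHI conditions on a finite collection of finite paths."""
    seen = {}
    for path in paths:
        hit_phi = any(s in sat_phi for s in path)   # path satisfies <>phi
        hit_psi = any(s in sat_psi for s in path)   # path satisfies <>psi
        phases = []
        if not hit_phi and not hit_psi:
            phases.append("before")                 # HI before stopping
        if hit_phi:
            phases.append("after_phi")              # HI after stopping (phi)
        if hit_psi:
            phases.append("after_psi")              # HI after stopping (psi)
        decision = scheduler(path)
        for phase in phases:
            key = (path[-1], phase)
            # Paths with the same last state and phase must agree.
            if seen.setdefault(key, decision) != decision:
                return False
    return True

# Hypothetical scheduler that changes its choice in s0 after visiting s3.
decisions = {("s0",): "pi3", ("s0", "s3"): "pi5", ("s0", "s3", "s0"): "pi1"}
paths = list(decisions)
is_shi = is_semi_history_independent(paths, decisions.__getitem__, {"s3"}, set())
is_hi = is_semi_history_independent(paths, decisions.__getitem__, set(), set())
```

With empty satisfaction sets every path is in the "before" phase, so the check degenerates to plain history independence, which this scheduler violates; with `s3` as stopping state the same scheduler passes the check.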

We now prove that semi history-independent schedulers suffice to attain
the optimal conditional probabilities for cpCTL formulas.

Theorem 2.5.3. Let $\Pi$ be a MDP, $\phi, \psi \in \mathrm{Path}$, and $\varphi = \mathrm{StopC}(\phi) \lor
\mathrm{StopC}(\psi)$. Assume that there exists a scheduler $\eta$ such that $\mathbb{P}_{\eta}[\psi] > 0$.
Then:

\[
\mathbb{P}^{+}[\phi|\psi] = \sup_{\eta \in \mathrm{Sch}^{\varphi}(\Pi)} \mathbb{P}_{\eta}[\phi|\psi].
\]

(If there exists no scheduler $\eta$ such that $\mathbb{P}_{\eta}[\psi] > 0$, then the supremum is 0.)

The proof of this theorem is rather complex. The first step is to prove
that there exists a scheduler $\eta$ HI before the stopping condition and such
that $\mathbb{P}_{\eta}[\phi|\psi]$ is 'close' (i.e. not further than a small value $\epsilon$) to the optimal
conditional probability $\mathbb{P}^{+}[\phi|\psi]$. For this purpose we introduce some definitions
and prove this property first for long paths (Lemma 2.5.5) and then,
step-by-step, in general (Lemma 2.5.6 and Corollary 2.5.1). After that, we
create a scheduler that is also HI after the stopping condition and whose
conditional probability is still close to the optimal one (Lemma 2.5.7). From
the above results, the theorem readily follows.

We now introduce some definitions and notation that we will need for 
the proof. 

Definition 2.5.4 (Cuts). Given a MDP $\Pi$ we say that a set $K \subseteq \mathrm{Paths}^{*}(\Pi)$
is a cut of $\Pi$ if $K$ is a downward-closed set of finite paths such that every
infinite path passes through it, i.e.

• $\forall \sigma_1 \in K.\ \forall \sigma_2 \in \mathrm{Paths}^{*}(\Pi).\ \sigma_1 \sqsubseteq \sigma_2 \Rightarrow \sigma_2 \in K$, and
• $\forall \omega \in \mathrm{Paths}(\Pi).\ \exists \sigma \in K.\ \sigma \sqsubseteq \omega$,

where $\sigma_1 \sqsubseteq \sigma_2$ means that $\sigma_2$ is an "extension" of $\sigma_1$, i.e. $\sigma_2 = \sigma_1\sigma'$ for
some path $\sigma'$. We denote the set of all cuts of $\Pi$ by $\mathbb{K}(\Pi)$.

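For $R \subseteq \mathrm{Paths}^{*}(s)$, we say that $\eta$ is history independent in $R$ if for all
$\sigma_1, \sigma_2 \in R$ such that $\mathrm{last}(\sigma_1) = \mathrm{last}(\sigma_2)$ we have that $\eta(\sigma_1) = \eta(\sigma_2)$. We
also define the sets $\Phi$ and $\Psi$ as the set of finite paths validating $\phi$ and $\psi$
respectively, i.e. $\Phi = \{\sigma \in \mathrm{Paths}^{*}(\Pi) \mid \sigma \models \phi\}$ and $\Psi = \{\sigma \in \mathrm{Paths}^{*}(\Pi) \mid
\sigma \models \psi\}$. Finally, given a MDP $\Pi$, two path formulas $\phi$, $\psi$, and $\hat{\epsilon} > 0$ we
define the set

\[
\mathcal{K} = \{(K, \eta) \in \mathbb{K}(\Pi) \times \mathrm{Sch}(\Pi) \mid \Phi \cup \Psi \subseteq K \text{ and } \eta \text{ is HI in } K \setminus (\Phi \cup \Psi)
\text{ and } \mathbb{P}^{+}[\phi|\psi] - \mathbb{P}_{\eta}[\phi|\psi] < \hat{\epsilon}\}.
\]

If a scheduler $\eta$ is HI in $K \setminus (\Phi \cup \Psi)$ then we say that $\eta$ is HI before the
stopping condition.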

Lemma 2.5.5 (non emptiness of $\mathcal{K}$). There exists $(K, \eta)$ such that $(K, \eta) \in
\mathcal{K}$ and that its complement $K^{c} = \mathrm{Paths}^{*}(\Pi) \setminus K$ is finite.

Proof. We show that, given formulas $\phi$, $\psi$ and $\hat{\epsilon} > 0$, there exists a cut $K$
and a scheduler $\eta^{*}$ such that $K^{c}$ is finite, $\Phi \cup \Psi \subseteq K$, $\eta^{*}$ is HI in $K \setminus (\Phi \cup \Psi)$,
and $\mathbb{P}^{+}[\phi|\psi] - \mathbb{P}_{\eta^{*}}[\phi|\psi] < \hat{\epsilon}$.

The proof is by case analysis on the structure of $\phi$ and $\psi$. We will
consider the cases where $\phi$ and $\psi$ are either "eventually operators" ($\Diamond$) or
"globally operators" ($\Box$); the proof for the until case follows along the same
lines.

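• Case $\phi$ is of the form $\Diamond\phi$ and $\psi$ is of the form $\Diamond\psi$:

Let us start by defining the probability of reaching $\phi$ in at most $N$ steps
as $\mathbb{P}_{\eta}[\leq N, \Diamond\phi] = \mathbb{P}_{\eta}[\langle\{\sigma \in \mathrm{Paths}^{*}(\Pi) \mid \sigma \models \Diamond\phi \land |\sigma| \leq N\}\rangle]$. Note that for
all pCTL reachability properties $\Diamond\phi$ and schedulers $\eta$ we have

\[
\lim_{N \to \infty} \mathbb{P}_{\eta}[\leq N, \Diamond\phi] = \mathbb{P}_{\eta}[\Diamond\phi].
\]

We also note that this result also holds for pCTL* formulas of the form
$\Diamond\phi \land \Diamond\psi$.

Let us now take a scheduler $\eta$ and a number $N$ such that

\[
\begin{array}{lr}
\mathbb{P}^{+}[\Diamond\phi|\Diamond\psi] - \mathbb{P}_{\eta}[\Diamond\phi|\Diamond\psi] < \epsilon = \hat{\epsilon}/3, \text{ and} & (2.1)\\
\mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi] - \mathbb{P}_{\eta}[\leq N, \Diamond\phi \land \Diamond\psi] < \epsilon', \text{ and} & (2.2)\\
\mathbb{P}_{\eta}[\Diamond\psi] - \mathbb{P}_{\eta}[\leq N, \Diamond\psi] < \epsilon', & (2.3)
\end{array}
\]

where $\epsilon'$ is such that

\[
\epsilon' < \min\left(2 \cdot \epsilon \cdot \mathbb{P}_{\eta}[\Diamond\psi],\ \frac{\epsilon \cdot \mathbb{P}_{\eta}[\Diamond\psi]^{2}}{\mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi] + 2 \cdot \epsilon \cdot \mathbb{P}_{\eta}[\Diamond\psi]}\right).
\]

The reasons for this particular choice for the bound of $\epsilon'$ will become clear later on
in the proof.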

We define $K$ as $\Phi \cup \Psi \cup \mathrm{Paths}^{*}(\leq N, \Pi)$, where the latter set is defined
as the set of paths with length larger than $N$, i.e. $\mathrm{Paths}^{*}(\leq N, \Pi) = \{\sigma \in
\mathrm{Paths}^{*}(\Pi) \mid N \leq |\sigma|\}$. In addition, we define $\eta^{*}$ as a scheduler HI in
$\mathrm{Paths}^{*}(\leq N, \Pi)$ behaving like $\eta$ for paths of length less than or equal to $N$
which additionally minimizes $\mathbb{P}[\Diamond\psi]$ after level $N$. In order to formally
define such a scheduler we let $S_N$ be the set of states that can be reached
in exactly $N$ steps, i.e., $S_N = \{s \in S \mid \exists \sigma \in \mathrm{Paths}^{*}(\Pi) : |\sigma| = N \land
\mathrm{last}(\sigma) = s\}$. Now for each $s \in S_N$ we let $\xi_s$ be a HI $s$-scheduler such that
$\mathbb{P}_{s,\xi_s}[\Diamond\psi] = \mathbb{P}^{-}_{s}[\Diamond\psi]$. Note that such a scheduler exists, i.e., it is always
possible to find a HI scheduler minimizing a reachability pCTL formula
[BdA95].

We now define $\eta^{*}$ as

\[
\eta^{*}(\sigma) \triangleq
\begin{cases}
\xi_{s}(\sigma_{|\alpha|}\sigma_{|\alpha|+1}\cdots\sigma_{|\sigma|}) & \text{if } \alpha \sqsubseteq \sigma \text{ for some } \alpha \in \mathrm{Paths}^{*}(=N, \Pi) \text{ such that } \mathrm{last}(\alpha) = s,\\
\eta(\sigma) & \text{otherwise,}
\end{cases}
\]

where $\mathrm{Paths}^{*}(=N, \Pi)$ denotes the set of paths of $\Pi$ of length $N$. It is easy to
see that $\eta^{*}$ minimizes $\mathbb{P}[\Diamond\psi]$ after level $N$. As for the history independency
of $\eta^{*}$ in $K$ there is still one more technical detail to consider: note there may
still be paths $\alpha_1 s_1 \sigma_1 t$ and $\alpha_2 s_2 \sigma_2 t$ such that $\alpha_1 s_1, \alpha_2 s_2 \in \mathrm{Paths}^{*}(=N, \Pi)$
and $\xi_{s_1}(s_1\sigma_1 t) \neq \xi_{s_2}(s_2\sigma_2 t)$. This is the case when there is more than one
distribution in $\tau(t)$ minimizing $\mathbb{P}_{t}[\Diamond\psi]$, and $\xi_{s_1}$ happens to choose a different
(minimizing) distribution than $\xi_{s_2}$ for the state $t$. Thus, the selection of
the family of schedulers $\{\xi_s\}_{s \in S_N}$ must be made in such a way that: for all
$s_1, s_2 \in S_N$ we have $\mathbb{P}_{s_1,\xi_{s_1}}[\Diamond\psi] = \mathbb{P}^{-}_{s_1}[\Diamond\psi]$, $\mathbb{P}_{s_2,\xi_{s_2}}[\Diamond\psi] = \mathbb{P}^{-}_{s_2}[\Diamond\psi]$, and for
all $\sigma_1 t \in \mathrm{Paths}^{*}(s_1)$, $\sigma_2 t \in \mathrm{Paths}^{*}(s_2)$ : $\xi_{s_1}(\sigma_1 t) = \xi_{s_2}(\sigma_2 t)$. It is easy to
check that such a family exists. We conclude that $\eta^{*}$ is HI in $\mathrm{Paths}^{*}(\leq N, \Pi)$
and thus HI in $K \setminus (\Phi \cup \Psi)$.

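We note that $\mathbb{P}_{\eta^{*}}[\Diamond\psi] > 0$; this follows from $\epsilon' < \mathbb{P}_{\eta}[\Diamond\psi]$, (2.1), (2.3),
and the definition of $\eta^{*}$.

Having defined $\eta^{*}$ we proceed to prove that such a scheduler satisfies
$\mathbb{P}^{+}[\phi|\psi] - \mathbb{P}_{\eta^{*}}[\phi|\psi] < \hat{\epsilon}$. It is possible to show that:

\[
\begin{array}{lr}
\mathbb{P}_{\eta}[\leq N, \Diamond\psi] \leq \mathbb{P}_{\eta^{*}}[\Diamond\psi] \leq \mathbb{P}_{\eta}[\Diamond\psi], & (2.4)\\
\mathbb{P}_{\eta}[\leq N, \Diamond\phi \land \Diamond\psi] \leq \mathbb{P}_{\eta^{*}}[\Diamond\phi \land \Diamond\psi] \leq \mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi] + \epsilon \cdot \mathbb{P}_{\eta}[\Diamond\psi]. & (2.5)
\end{array}
\]

(2.4) and the first inequality of (2.5) follow straightforwardly from the
definition of $\eta^{*}$. For the second inequality of (2.5) suppose by contradiction
that $\mathbb{P}_{\eta^{*}}[\Diamond\phi \land \Diamond\psi] > \mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi] + \epsilon \cdot \mathbb{P}_{\eta}[\Diamond\psi]$. Then

\[
\mathbb{P}_{\eta^{*}}[\Diamond\phi|\Diamond\psi] = \frac{\mathbb{P}_{\eta^{*}}[\Diamond\phi \land \Diamond\psi]}{\mathbb{P}_{\eta^{*}}[\Diamond\psi]} > \frac{\mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi] + \epsilon \cdot \mathbb{P}_{\eta}[\Diamond\psi]}{\mathbb{P}_{\eta}[\Diamond\psi]} = \mathbb{P}_{\eta}[\Diamond\phi|\Diamond\psi] + \epsilon,
\]

contradicting (2.1).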

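Now we have all the necessary ingredients to show that

\[
|\mathbb{P}_{\eta}[\Diamond\phi|\Diamond\psi] - \mathbb{P}_{\eta^{*}}[\Diamond\phi|\Diamond\psi]| < 2 \cdot \epsilon. \qquad (2.6)
\]

Note that

\[
\frac{\mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi] - \epsilon'}{\mathbb{P}_{\eta}[\Diamond\psi]} \leq \mathbb{P}_{\eta^{*}}[\Diamond\phi|\Diamond\psi]
\quad\text{and}\quad
\mathbb{P}_{\eta^{*}}[\Diamond\phi|\Diamond\psi] \leq \frac{\mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi] + \epsilon \cdot \mathbb{P}_{\eta}[\Diamond\psi]}{\mathbb{P}_{\eta}[\Diamond\psi] - \epsilon'}.
\]

The first inequality holds because $\mathbb{P}_{\eta^{*}}[\Diamond\psi] \leq \mathbb{P}_{\eta}[\Diamond\psi]$ and (combining (2.5)
and (2.2)) $\mathbb{P}_{\eta^{*}}[\Diamond\phi \land \Diamond\psi] \geq \mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi] - \epsilon'$. The second inequality holds
because $\mathbb{P}_{\eta^{*}}[\Diamond\phi \land \Diamond\psi] \leq \mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi] + \epsilon \cdot \mathbb{P}_{\eta}[\Diamond\psi]$ and (combining (2.4) and
(2.3)) $\mathbb{P}_{\eta^{*}}[\Diamond\psi] \geq \mathbb{P}_{\eta}[\Diamond\psi] - \epsilon'$. It is easy to see that $\mathbb{P}_{\eta}[\Diamond\phi|\Diamond\psi]$ falls in the
same interval, i.e., both $\mathbb{P}_{\eta}[\Diamond\phi|\Diamond\psi]$ and $\mathbb{P}_{\eta^{*}}[\Diamond\phi|\Diamond\psi]$ are in the interval

\[
\left[\frac{\mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi] - \epsilon'}{\mathbb{P}_{\eta}[\Diamond\psi]},\ \frac{\mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi] + \epsilon \cdot \mathbb{P}_{\eta}[\Diamond\psi]}{\mathbb{P}_{\eta}[\Diamond\psi] - \epsilon'}\right].
\]

Thus, we can prove (2.6) by proving

\[
\frac{\mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi]}{\mathbb{P}_{\eta}[\Diamond\psi]} - \frac{\mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi] - \epsilon'}{\mathbb{P}_{\eta}[\Diamond\psi]} < 2 \cdot \epsilon, \text{ and}
\]

\[
\frac{\mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi] + \epsilon \cdot \mathbb{P}_{\eta}[\Diamond\psi]}{\mathbb{P}_{\eta}[\Diamond\psi] - \epsilon'} - \frac{\mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi]}{\mathbb{P}_{\eta}[\Diamond\psi]} < 2 \cdot \epsilon.
\]

The first inequality holds if and only if $\epsilon' < 2 \cdot \epsilon \cdot \mathbb{P}_{\eta}[\Diamond\psi]$. As for the second
inequality, it is equivalent to

\[
\begin{array}{l}
\mathbb{P}_{\eta}[\Diamond\psi]^{2} \cdot \epsilon + \mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi] \cdot \epsilon' < 2 \cdot \epsilon \cdot (\mathbb{P}_{\eta}[\Diamond\psi] - \epsilon') \cdot \mathbb{P}_{\eta}[\Diamond\psi]\\
\iff \mathbb{P}_{\eta}[\Diamond\psi]^{2} \cdot \epsilon + \mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi] \cdot \epsilon' < 2 \cdot \epsilon \cdot \mathbb{P}_{\eta}[\Diamond\psi]^{2} - 2 \cdot \epsilon \cdot \epsilon' \cdot \mathbb{P}_{\eta}[\Diamond\psi]\\
\iff \epsilon' < \dfrac{\epsilon \cdot \mathbb{P}_{\eta}[\Diamond\psi]^{2}}{\mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi] + 2 \cdot \epsilon \cdot \mathbb{P}_{\eta}[\Diamond\psi]}.
\end{array}
\]

We conclude, by definition of $\epsilon'$, that both inequalities hold.

Now, putting (2.1) and (2.6) together, we have $\mathbb{P}^{+}[\Diamond\phi|\Diamond\psi] - \mathbb{P}_{\eta^{*}}[\Diamond\phi|\Diamond\psi] <
3 \cdot \epsilon = \hat{\epsilon}$, which concludes the proof for this case.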
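• Case $\phi$ is of the form $\Diamond\phi$ and $\psi$ is of the form $\Box\psi$:

We now construct a cut $K$ and a scheduler $\eta^{*}$ such that $K^{c}$ is finite, $\Phi \cup \Psi \subseteq
K$, $\eta^{*}$ is HI in $K \setminus (\Phi \cup \Psi)$, and $\mathbb{P}_{\eta^{*}}[\Box\neg\phi|\Box\psi] - \mathbb{P}^{-}[\Box\neg\phi|\Box\psi] < \hat{\epsilon}$. Note
that such a cut and scheduler also satisfy $\mathbb{P}^{+}[\Diamond\phi|\Box\psi] - \mathbb{P}_{\eta^{*}}[\Diamond\phi|\Box\psi] < \hat{\epsilon}$.

The proof goes similarly to the previous case. We start by defining
the probability of paths of length $N$ always satisfying $\phi$ as $\mathbb{P}_{\eta}[=N, \Box\phi] =
\mathbb{P}_{\eta}[\langle\{\sigma \in \mathrm{Paths}^{*}(\Pi) \mid \sigma \models \Box\phi \land |\sigma| = N\}\rangle]$. Note that for all pCTL formulas
of the form $\Box\phi$ and schedulers $\eta$ we have

\[
\lim_{N \to \infty} \mathbb{P}_{\eta}[=N, \Box\phi] = \mathbb{P}_{\eta}[\Box\phi].
\]

The same result holds for the pCTL* formula $\Box(\phi \land \psi)$. It is easy to check
that for all $N$ and $\phi$ we have $\mathbb{P}_{\eta}[=N, \Box\phi] \geq \mathbb{P}_{\eta}[\Box\phi]$.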

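Now we take a scheduler $\eta$ and a number $N$ such that:

\[
\begin{array}{l}
0 \leq \mathbb{P}_{\eta}[\Box\neg\phi|\Box\psi] - \mathbb{P}^{-}[\Box\neg\phi|\Box\psi] < \epsilon = \hat{\epsilon}/3, \text{ and}\\
0 \leq \mathbb{P}_{\eta}[=N, \Box(\neg\phi \land \psi)] - \mathbb{P}_{\eta}[\Box(\neg\phi \land \psi)] < \epsilon', \text{ and}\\
0 \leq \mathbb{P}_{\eta}[=N, \Box\psi] - \mathbb{P}_{\eta}[\Box\psi] < \epsilon',
\end{array}
\]

where $\epsilon'$ is such that

\[
\epsilon' < \min\left(\epsilon \cdot \mathbb{P}_{\eta}[\Box\psi],\ \frac{\epsilon \cdot \mathbb{P}_{\eta}[\Box\psi]^{2}}{\mathbb{P}_{\eta}[\Box(\neg\phi \land \psi)]}\right).
\]

We define $K$ as before, i.e., $K = \Phi \cup \Psi \cup \mathrm{Paths}^{*}(\leq N, \Pi)$. In addition,
we can construct (as we did in the previous case) a scheduler $\eta^{*}$ behaving
as $\eta$ for paths of length at most $N$ and maximizing (instead of minimizing
as in the previous case) $\mathbb{P}[\Box\psi]$ afterwards. Again, it is easy to check that
$\eta^{*}$ is HI in $K \setminus (\Phi \cup \Psi)$.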

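Then we have

\[
\begin{array}{l}
\mathbb{P}_{\eta}[\Box\psi] \leq \mathbb{P}_{\eta^{*}}[\Box\psi] \leq \mathbb{P}_{\eta}[=N, \Box\psi],\\
\mathbb{P}_{\eta}[\Box(\neg\phi \land \psi)] - \epsilon \cdot \mathbb{P}_{\eta}[\Box\psi] \leq \mathbb{P}_{\eta^{*}}[\Box(\neg\phi \land \psi)] \leq \mathbb{P}_{\eta}[=N, \Box(\neg\phi \land \psi)].
\end{array}
\]

In addition, it is easy to check that

\[
\begin{array}{l}
-\mathbb{P}_{\eta}[\Box\psi] \cdot \epsilon \leq \mathbb{P}_{\eta^{*}}[\Box(\neg\phi \land \psi)] - \mathbb{P}_{\eta}[\Box(\neg\phi \land \psi)] < \epsilon',\\
0 \leq \mathbb{P}_{\eta^{*}}[\Box\psi] - \mathbb{P}_{\eta}[\Box\psi] < \epsilon'.
\end{array}
\]

Similarly to the previous case we now show that

\[
|\mathbb{P}_{\eta}[\Box\neg\phi|\Box\psi] - \mathbb{P}_{\eta^{*}}[\Box\neg\phi|\Box\psi]| < 2 \cdot \epsilon, \qquad (2.7)
\]

which together with $\mathbb{P}_{\eta}[\Box\neg\phi|\Box\psi] - \mathbb{P}^{-}[\Box\neg\phi|\Box\psi] < \epsilon$ concludes the proof.
In order to prove (2.7) we show that

\[
-2 \cdot \epsilon < \mathbb{P}_{\eta^{*}}[\Box\neg\phi|\Box\psi] - \mathbb{P}_{\eta}[\Box\neg\phi|\Box\psi] < \epsilon
\]

or, equivalently

a) $\mathbb{P}_{\eta^{*}}[\Box(\neg\phi \land \psi)] \cdot \mathbb{P}_{\eta}[\Box\psi] - \mathbb{P}_{\eta}[\Box(\neg\phi \land \psi)] \cdot \mathbb{P}_{\eta^{*}}[\Box\psi] <
\mathbb{P}_{\eta}[\Box\psi] \cdot \mathbb{P}_{\eta^{*}}[\Box\psi] \cdot \epsilon$, and

b) $-2 \cdot \mathbb{P}_{\eta}[\Box\psi] \cdot \mathbb{P}_{\eta^{*}}[\Box\psi] \cdot \epsilon <
\mathbb{P}_{\eta^{*}}[\Box(\neg\phi \land \psi)] \cdot \mathbb{P}_{\eta}[\Box\psi] - \mathbb{P}_{\eta}[\Box(\neg\phi \land \psi)] \cdot \mathbb{P}_{\eta^{*}}[\Box\psi]$.

It is possible to verify that a) is equivalent to $\epsilon' < \epsilon \cdot \mathbb{P}_{\eta}[\Box\psi]$ and that
b) is equivalent to $\epsilon' < \epsilon \cdot \mathbb{P}_{\eta}[\Box\psi]^{2} / \mathbb{P}_{\eta}[\Box(\neg\phi \land \psi)]$. The desired result follows by definition
of $\epsilon'$. □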
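The role of the bound on $\epsilon'$ in the first case of the proof can be illustrated numerically. The sketch below takes the bound to be the minimum of $2\epsilon\,\mathbb{P}_{\eta}[\Diamond\psi]$ and $\epsilon\,\mathbb{P}_{\eta}[\Diamond\psi]^{2} / (\mathbb{P}_{\eta}[\Diamond\phi \land \Diamond\psi] + 2\epsilon\,\mathbb{P}_{\eta}[\Diamond\psi])$, which is what the two inequalities used to prove (2.6) require; the function names and the sample values are assumptions made for the illustration.

```python
def eps_prime_bound(eps, p_psi, p_phi_psi):
    """Upper bound on eps' (assumed form, see the lead-in):
    min(2*eps*P[psi], eps*P[psi]^2 / (P[phi and psi] + 2*eps*P[psi]))."""
    return min(2 * eps * p_psi,
               eps * p_psi ** 2 / (p_phi_psi + 2 * eps * p_psi))

def interval_within_2eps(eps, eps_prime, p_psi, p_phi_psi):
    """Check that both ends of the interval containing the two conditional
    probabilities are closer than 2*eps to P[phi and psi] / P[psi]."""
    cond = p_phi_psi / p_psi
    lower = (p_phi_psi - eps_prime) / p_psi
    upper = (p_phi_psi + eps * p_psi) / (p_psi - eps_prime)
    return cond - lower < 2 * eps and upper - cond < 2 * eps

# Sample values, chosen only for illustration.
bound = eps_prime_bound(0.01, 0.5, 0.2)
ok = interval_within_2eps(0.01, 0.5 * bound, 0.5, 0.2)
```

Any `eps_prime` strictly below the bound keeps both conditional probabilities within $2\epsilon$ of each other, while a larger value may not.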

In the proof of the following lemma we step-by-step find pairs $(K, \eta)$ in
$\mathcal{K}$ with larger $K$ and $\eta$ still close to the optimal until finally $K$ is equal to
the whole of $\mathrm{Paths}^{*}(\Pi)$.

Lemma 2.5.6 (completeness of $\mathcal{K}$). There exists a scheduler $\eta$ such that
$(\mathrm{Paths}^{*}(\Pi), \eta) \in \mathcal{K}$.

Proof. We prove that if we take a $(K, \eta) \in \mathcal{K}$ such that $|K^{c}|$ is minimal
then $K^{c} = \emptyset$ or, equivalently, $K = \mathrm{Paths}^{*}(\Pi)$. Note that a pair $(K, \eta)$ with
minimal $|K^{c}|$ exists because, by the previous lemma, $\mathcal{K}$ is not empty.

The proof is by contradiction: we suppose $K^{c} \neq \emptyset$ and arrive at a
contradiction on the minimality of $|K^{c}|$. Formally, we show that for all
$(K, \eta) \in \mathcal{K}$ such that $K^{c} \neq \emptyset$, there exists a cut $K^{*} \supset K$ and a scheduler
$\eta^{*}$ such that $(K^{*}, \eta^{*}) \in \mathcal{K}$, i.e. such that $\eta^{*}$ is HI in $K^{*} \setminus (\Phi \cup \Psi)$ and
$\mathbb{P}^{+}[\phi|\psi] - \mathbb{P}_{\eta^{*}}[\phi|\psi] < \hat{\epsilon}$.

To improve readability, we prove this result for the case $\phi$ is of the form
$\Diamond\phi$ and $\psi$ is of the form $\Diamond\psi$. However, all the technical details of the proof
hold for arbitrary $\phi$ and $\psi$.

Let us start defining the boundary of a cut $K$ as

\[
\delta K = \{\sigma_1 \in K \mid \forall \sigma_2 \in \mathrm{Paths}^{*}(\Pi).\ \sigma_2 \sqsubset \sigma_1 \Rightarrow \sigma_2 \notin K\}.
\]






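Let $\rho$ be a path in $K^{c}$ such that $\rho t \in \delta K$. Note that by the assumption
$K^{c} \neq \emptyset$ such a $\rho$ exists. Now, if for all paths $\alpha \in K$ we have $\mathrm{last}(\alpha) =
\mathrm{last}(\rho) \Rightarrow \eta(\alpha) = \eta(\rho)$ then $\eta$ is also HI in $(K \cup \{\rho\}) \setminus (\Phi \cup \Psi)$ so we have
$(K \cup \{\rho\}, \eta) \in \mathcal{K}$ as we wanted to show. Now let us assume otherwise, i.e.
that there exists a path $\alpha \in K \setminus (\Phi \cup \Psi)$ such that $\mathrm{last}(\alpha) = \mathrm{last}(\rho)$ and
$\eta(\alpha) \neq \eta(\rho)$. We let $s = \mathrm{last}(\rho)$, $\pi_1 = \eta(\rho)$, $\pi_2 = \eta(\alpha)$, and $K_s = \{\sigma \in K \mid
\mathrm{last}(\sigma) = s\} \setminus (\Phi \cup \Psi)$. Note that for all $\alpha' \in K_s$ we have $\eta(\alpha') = \pi_2$; this
follows from the fact that $\eta$ is HI in $K \setminus (\Phi \cup \Psi)$.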

Figure 2.4 provides a graphic representation of this description. The figure
shows the set $\mathrm{Paths}^{*}(\Pi)$ of all finite paths of $\Pi$, the cut $K$ of $\mathrm{Paths}^{*}(\Pi)$,
the path $\rho$ reaching $s$ (in red and dotted border line style), and a path $\alpha$
reaching $s$ in $K$ (in blue and continuous border line style). The fact that $\eta$ takes
different decisions on $\rho$ and $\alpha$ is represented by the different colors and line
style of their respective last states $s$.

Figure 2.4: Graphic representation of $\mathrm{Paths}^{*}(\Pi)$, $\Phi \cup \Psi$, $K$, $\delta K$, $\rho$, and $\alpha$.

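We now define two schedulers $\eta_1$ and $\eta_2$ such that they are HI in $(K \cup
\{\rho\}) \setminus (\Phi \cup \Psi)$. Both $\eta_1$ and $\eta_2$ are the same as $\eta$ everywhere but in $K_s$
and $\rho$, respectively. The first one selects $\pi_1$ for all $\alpha \in K_s$ (instead of $\pi_2$ as
$\eta$ does), and the second scheduler selects $\pi_2$ in $\rho$ (instead of $\pi_1$):

\[
\eta_1(\sigma) =
\begin{cases}
\pi_1 & \text{if } \sigma \in K_s\\
\eta(\sigma) & \text{otherwise}
\end{cases}
\qquad\text{and}\qquad
\eta_2(\sigma) =
\begin{cases}
\pi_2 & \text{if } \sigma = \rho\\
\eta(\sigma) & \text{otherwise.}
\end{cases}
\]

Now we plan to prove that either $\eta_1$ is "better" than $\eta$ or $\eta_2$ is "better"
than $\eta$. In order to prove this result, we will show that:

\[
\mathbb{P}_{\eta_1}[\Diamond\phi|\Diamond\psi] \leq \mathbb{P}_{\eta_2}[\Diamond\phi|\Diamond\psi] \iff \mathbb{P}_{\eta}[\Diamond\phi|\Diamond\psi] \leq \mathbb{P}_{\eta_2}[\Diamond\phi|\Diamond\psi] \qquad (2.8)
\]

and

\[
\mathbb{P}_{\eta_2}[\Diamond\phi|\Diamond\psi] \leq \mathbb{P}_{\eta_1}[\Diamond\phi|\Diamond\psi] \iff \mathbb{P}_{\eta}[\Diamond\phi|\Diamond\psi] \leq \mathbb{P}_{\eta_1}[\Diamond\phi|\Diamond\psi]. \qquad (2.9)
\]

Therefore, if $\mathbb{P}_{\eta_1}[\Diamond\phi|\Diamond\psi] \leq \mathbb{P}_{\eta_2}[\Diamond\phi|\Diamond\psi]$ then we have $(K \cup \{\rho\}, \eta_2) \in \mathcal{K}$, and
otherwise $(K \cup \{\rho\}, \eta_1) \in \mathcal{K}$. So, the desired result follows from (2.8) and
(2.9). We will prove (2.8); the other case follows the same way.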

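In order to prove (2.8) we need to analyze more closely the conditional
probability $\mathbb{P}[\Diamond\phi|\Diamond\psi] = \mathbb{P}[\Phi|\Psi]$ for each of the schedulers $\eta$, $\eta_1$, and $\eta_2$. For
that purpose we partition the sets $\Phi \cap \Psi$ and $\Psi$ into four parts, i.e. disjoint
sets. The plan is to partition $\Phi \cap \Psi$ and $\Psi$ in such a way that we can make use
of the fact that $\eta$, $\eta_1$, and $\eta_2$ are similar to each other (they only differ in
the decision taken in $K_s$ or $\rho$) obtaining, in this way, that the probabilities
of the parts are the same under these schedulers or differ only by a factor
(this intuition will become clearer later on in the proof). Such a condition is
the key element of our proof of (2.8). Let us start by partitioning $\Psi$:

i) We define $\Psi_{\overline{\rho},\overline{k_s}}$ as the set of paths in $\Psi$ neither passing through $K_s$
nor $\rho$, formally

\[
\Psi_{\overline{\rho},\overline{k_s}} = \Psi \setminus (\langle K_s \rangle \cup \langle \rho \rangle).
\]

ii) We define $\Psi_{\rho,\overline{k_s}}$ as the set of paths in $\Psi$ passing through $\rho$ but not
through $K_s$, i.e.:

\[
\Psi_{\rho,\overline{k_s}} = \Psi \cap (\langle \rho \rangle \setminus \langle K_s \rangle).
\]

iii) We define $\Psi_{\rho,k_s}$ as the set of paths in $\Psi$ passing through $\rho$ and $K_s$,
i.e.:

\[
\Psi_{\rho,k_s} = \Psi \cap \langle \rho \rangle \cap \langle K_s \rangle.
\]

iv) We define $\Psi_{\overline{\rho},k_s}$ as the set of paths in $\Psi$ passing through $K_s$ but not
through $\rho$, i.e.:

\[
\Psi_{\overline{\rho},k_s} = \Psi \cap (\langle K_s \rangle \setminus \langle \rho \rangle).
\]

Note that $\Psi = \Psi_{\overline{\rho},\overline{k_s}} \cup \Psi_{\rho,\overline{k_s}} \cup \Psi_{\rho,k_s} \cup \Psi_{\overline{\rho},k_s}$.

Similarly, we can partition the set of paths $\Phi \cap \Psi$ into four parts obtaining
$\Phi \cap \Psi = (\Phi \cap \Psi)_{\overline{\rho},\overline{k_s}} \cup (\Phi \cap \Psi)_{\rho,\overline{k_s}} \cup (\Phi \cap \Psi)_{\rho,k_s} \cup (\Phi \cap \Psi)_{\overline{\rho},k_s}$.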

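In the following we analyze the probabilities (under $\eta$) of each part
separately.

• The probability of $\Psi_{\rho,\overline{k_s}}$ can be written as $p_{\rho} \cdot x_{\psi}$, where $p_{\rho}$ is the
probability of $\rho$ and $x_{\psi}$ is the probability of reaching $\psi$ without passing
through $K_s$ given $\rho$. More formally, $\mathbb{P}_{\eta}[\Psi_{\rho,\overline{k_s}}] = \mathbb{P}_{\eta}[\Psi \cap (\langle \rho \rangle \setminus \langle K_s \rangle)]
= \mathbb{P}_{\eta}[\langle \rho \rangle] \cdot \mathbb{P}_{\eta}[\Psi \cap (\langle \rho \rangle \setminus \langle K_s \rangle) \mid \langle \rho \rangle] = p_{\rho} \cdot x_{\psi}$.

• The probability of $\Psi_{\rho,k_s}$ can be written as $p_{\rho} \cdot x_s \cdot \frac{y_{\psi}}{1 - y_s}$, where $x_s$ is
the probability of passing through $K_s$ given $\rho$, $y_{\psi}$ is the probability
of, given $\alpha$, reaching $\psi$ without passing through $K_s$ after $\alpha$; and $y_s$
is the probability of, given $\alpha$, passing through $K_s$ again. Remember
that $\alpha$ is any path in $K_s$. Formally, we have

\[
\begin{array}{rcl}
\mathbb{P}_{\eta}[\Psi_{\rho,k_s}] & = & \mathbb{P}_{\eta}[\Psi \cap \langle \rho \rangle \cap \langle K_s \rangle]\\
& = & \mathbb{P}_{\eta}[\langle \rho \rangle] \cdot \mathbb{P}_{\eta}[\langle K_s \rangle \mid \langle \rho \rangle] \cdot \mathbb{P}_{\eta}[\Psi \mid \langle K_s \rangle \cap \langle \rho \rangle]\\
& = & p_{\rho} \cdot x_s \cdot \mathbb{P}_{\eta}[\Psi \mid \langle \alpha \rangle].
\end{array}
\]

Furthermore,

\[
\mathbb{P}_{\eta}[\Psi \mid \langle \alpha \rangle] = \frac{\mathbb{P}_{\eta}[\Psi \cap \overline{K_s\,\mathrm{again}} \mid \langle \alpha \rangle]}{\mathbb{P}_{\eta}[\overline{K_s\,\mathrm{again}} \mid \langle \alpha \rangle]} = \frac{y_{\psi}}{1 - y_s},
\]

where $K_s\,\mathrm{again} = \bigcup \{\langle \alpha\sigma \rangle \mid \alpha\sigma \in K_s\}$ and $\overline{K_s\,\mathrm{again}} = \langle \alpha \rangle \setminus K_s\,\mathrm{again}$.

• The probability of $\Psi_{\overline{\rho},k_s}$ can be written as $p_{s_k} \cdot \frac{y_{\psi}}{1 - y_s}$, where $p_{s_k}$ is the
probability of passing through $K_s$ without passing through $\rho$. Formally,
$\mathbb{P}_{\eta}[\Psi_{\overline{\rho},k_s}] = \mathbb{P}_{\eta}[\Psi \cap (\langle K_s \rangle \setminus \langle \rho \rangle)] = \mathbb{P}_{\eta}[\langle K_s \rangle \setminus \langle \rho \rangle] \cdot \mathbb{P}_{\eta}[\Psi \mid \langle \alpha \rangle]
= p_{s_k} \cdot \frac{y_{\psi}}{1 - y_s}$.

• Finally, we write the probability of $\Psi_{\overline{\rho},\overline{k_s}}$ as $p_{\psi}$.

A similar reasoning can be used to analyze the probabilities associated
to the parts of $\Phi \cap \Psi$. In this way we obtain that (1) $\mathbb{P}_{\eta}[(\Phi \cap \Psi)_{\rho,\overline{k_s}}] =
p_{\rho} \cdot x_{\phi\psi}$, where $x_{\phi\psi}$ is the probability of reaching $\phi$ and $\psi$ without passing
through $K_s$ given $\rho$, (2) $\mathbb{P}_{\eta}[(\Phi \cap \Psi)_{\rho,k_s}] = p_{\rho} \cdot x_s \cdot \frac{y_{\phi\psi}}{1 - y_s}$, where $y_{\phi\psi}$ is the
probability of reaching $\phi$ and $\psi$ without passing through $K_s$ afterwards
given $\alpha$, (3) $\mathbb{P}_{\eta}[(\Phi \cap \Psi)_{\overline{\rho},k_s}] = p_{s_k} \cdot \frac{y_{\phi\psi}}{1 - y_s}$, and (4) $\mathbb{P}_{\eta}[(\Phi \cap \Psi)_{\overline{\rho},\overline{k_s}}] = p_{\phi\psi}$.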

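In order to help the intuition of the reader, we now provide a graphical
representation of the probability (under $\eta$) of the sets $\Phi \cap \Psi$ and $\Psi$ by
means of a Markov chain $M_{\eta}$ (see Figure 2.5). The missing values are defined
as $p_{\overline{\phi}\psi} = p_{\psi} - p_{\phi\psi}$ and $p_{\overline{\psi}} = 1 - p_{s_k} - p_{\rho} - p_{\psi}$, and similarly for $x_{\overline{\phi}\psi}$, $x_{\overline{\psi}}$, $y_{\overline{\phi}\psi}$,
and $y_{\overline{\psi}}$. Furthermore, absorbing states $\phi\psi$ denote states where $\phi \land \psi$ holds,
absorbing states $\overline{\phi}\psi$ denote states where $\neg\phi \land \psi$ holds, and $\overline{\psi}$ denotes a
state where $\neg\psi$ holds. Finally, the state $\rho$ represents the state of the model
where $\rho$ has just been reached and $\alpha$ a state where any of the paths $\alpha$ in
$K_s$ has just been reached. To see how this Markov chain is related to the
probabilities of $\Phi \cap \Psi$ and $\Psi$ on the original MDP consider, for example,
the probabilities of the set $\Phi \cap \Psi$. It is easy to show that

\[
\begin{array}{rcl}
\mathbb{P}_{\eta}[\Phi \cap \Psi] & = & \mathbb{P}_{\eta}[(\Phi \cap \Psi)_{\overline{\rho},\overline{k_s}}] + \mathbb{P}_{\eta}[(\Phi \cap \Psi)_{\rho,\overline{k_s}}] + \mathbb{P}_{\eta}[(\Phi \cap \Psi)_{\rho,k_s}] + \mathbb{P}_{\eta}[(\Phi \cap \Psi)_{\overline{\rho},k_s}]\\
& = & p_{\phi\psi} + p_{\rho} \cdot x_{\phi\psi} + p_{\rho} \cdot x_s \cdot \frac{y_{\phi\psi}}{1 - y_s} + p_{s_k} \cdot \frac{y_{\phi\psi}}{1 - y_s} = \mathbb{P}_{M_{\eta}}[\Diamond \phi\psi].
\end{array}
\]

We note that the values $p_{s_k}$, $p_{\rho}$, $p_{\phi\psi}$, $p_{\psi}$, and $p_{\overline{\psi}}$ coincide for $\eta$, $\eta_1$, and
$\eta_2$, whereas the values $\vec{x} = (x_s, x_{\psi}, x_{\phi\psi}, x_{\overline{\psi}})$ coincide for $\eta$ and $\eta_1$ and the
values $\vec{y} = (y_s, y_{\psi}, y_{\phi\psi}, y_{\overline{\psi}})$ coincide for $\eta$ and $\eta_2$. Thus, the variant of $M_{\eta}$
in which $\vec{y}$ is replaced by $\vec{x}$ describes the probability of each partition under
the scheduler $\eta_1$ instead of $\eta$. Similarly, the variant in which $\vec{x}$ is replaced
by $\vec{y}$ represents the probability of each partition under the scheduler $\eta_2$.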

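Now we have all the ingredients needed to prove (2.8). Our plan is to
show that:

1) $\mathbb{P}_{\eta_1}[\Phi|\Psi] \leq \mathbb{P}_{\eta_2}[\Phi|\Psi] \iff (p_{\rho} + p_{s_k}) \cdot d \leq 0 \iff d \leq 0$, and

2) $\mathbb{P}_{\eta}[\Phi|\Psi] \leq \mathbb{P}_{\eta_2}[\Phi|\Psi] \iff (1 - y_s) \cdot p_{\rho} \cdot d \leq 0 \iff d \leq 0$,

where $d$ is the following determinant

\[
d = \begin{vmatrix}
p_{\phi\psi} & p_{\psi} & p_{\rho} + p_{s_k}\\
x_{\phi\psi} & x_{\psi} & x_s - 1\\
y_{\phi\psi} & y_{\psi} & y_s - 1
\end{vmatrix}.
\]

Figure 2.5: Graphical representation of how we write the probability of
each partition: $M_{\eta}$.

We now proceed to prove 1):

\[
\begin{array}{l}
\mathbb{P}_{\eta_1}[\Phi|\Psi] - \mathbb{P}_{\eta_2}[\Phi|\Psi] \leq 0\\[4pt]
\iff \dfrac{p_{\phi\psi} + p_{\rho} \cdot x_{\phi\psi} + (p_{\rho} \cdot x_s + p_{s_k}) \cdot \frac{x_{\phi\psi}}{1 - x_s}}{p_{\psi} + p_{\rho} \cdot x_{\psi} + (p_{\rho} \cdot x_s + p_{s_k}) \cdot \frac{x_{\psi}}{1 - x_s}} - \dfrac{p_{\phi\psi} + p_{\rho} \cdot y_{\phi\psi} + (p_{\rho} \cdot y_s + p_{s_k}) \cdot \frac{y_{\phi\psi}}{1 - y_s}}{p_{\psi} + p_{\rho} \cdot y_{\psi} + (p_{\rho} \cdot y_s + p_{s_k}) \cdot \frac{y_{\psi}}{1 - y_s}} \leq 0\\[12pt]
\iff \dfrac{p_{\phi\psi}(1 - x_s) + p_{\rho} x_{\phi\psi} + p_{s_k} x_{\phi\psi}}{p_{\psi}(1 - x_s) + p_{\rho} x_{\psi} + p_{s_k} x_{\psi}} - \dfrac{p_{\phi\psi}(1 - y_s) + p_{\rho} y_{\phi\psi} + p_{s_k} y_{\phi\psi}}{p_{\psi}(1 - y_s) + p_{\rho} y_{\psi} + p_{s_k} y_{\psi}} \leq 0\\[12pt]
\iff \begin{vmatrix}
p_{\phi\psi}(1 - x_s) + p_{\rho} x_{\phi\psi} + p_{s_k} x_{\phi\psi} & p_{\psi}(1 - x_s) + p_{\rho} x_{\psi} + p_{s_k} x_{\psi}\\
p_{\phi\psi}(1 - y_s) + p_{\rho} y_{\phi\psi} + p_{s_k} y_{\phi\psi} & p_{\psi}(1 - y_s) + p_{\rho} y_{\psi} + p_{s_k} y_{\psi}
\end{vmatrix} \leq 0.
\end{array}
\]

A long but straightforward computation shows that the 2x2 determinant
in the line above is equal to $(p_{\rho} + p_{s_k}) \cdot d$.

The proof of 2) proceeds along the same lines.

\[
\begin{array}{l}
\mathbb{P}_{\eta}[\Phi|\Psi] - \mathbb{P}_{\eta_2}[\Phi|\Psi] \leq 0\\[4pt]
\iff \dfrac{p_{\phi\psi} + p_{\rho} \cdot x_{\phi\psi} + (p_{\rho} \cdot x_s + p_{s_k}) \cdot \frac{y_{\phi\psi}}{1 - y_s}}{p_{\psi} + p_{\rho} \cdot x_{\psi} + (p_{\rho} \cdot x_s + p_{s_k}) \cdot \frac{y_{\psi}}{1 - y_s}} - \dfrac{p_{\phi\psi} + p_{\rho} \cdot y_{\phi\psi} + (p_{\rho} \cdot y_s + p_{s_k}) \cdot \frac{y_{\phi\psi}}{1 - y_s}}{p_{\psi} + p_{\rho} \cdot y_{\psi} + (p_{\rho} \cdot y_s + p_{s_k}) \cdot \frac{y_{\psi}}{1 - y_s}} \leq 0\\[12pt]
\iff \dfrac{p_{\phi\psi}(1 - y_s) + p_{\rho} x_{\phi\psi}(1 - y_s) + (p_{\rho} x_s + p_{s_k}) y_{\phi\psi}}{p_{\psi}(1 - y_s) + p_{\rho} x_{\psi}(1 - y_s) + (p_{\rho} x_s + p_{s_k}) y_{\psi}} - \dfrac{p_{\phi\psi}(1 - y_s) + p_{\rho} y_{\phi\psi} + p_{s_k} y_{\phi\psi}}{p_{\psi}(1 - y_s) + p_{\rho} y_{\psi} + p_{s_k} y_{\psi}} \leq 0\\[12pt]
\iff \begin{vmatrix}
p_{\phi\psi}(1 - y_s) + p_{\rho} x_{\phi\psi}(1 - y_s) + (p_{\rho} x_s + p_{s_k}) y_{\phi\psi} & p_{\psi}(1 - y_s) + p_{\rho} x_{\psi}(1 - y_s) + (p_{\rho} x_s + p_{s_k}) y_{\psi}\\
p_{\phi\psi}(1 - y_s) + p_{\rho} y_{\phi\psi} + p_{s_k} y_{\phi\psi} & p_{\psi}(1 - y_s) + p_{\rho} y_{\psi} + p_{s_k} y_{\psi}
\end{vmatrix} \leq 0,
\end{array}
\]

and also here a long computation shows that this last 2x2 determinant is
equal to $(1 - y_s) \cdot p_{\rho} \cdot d$. □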
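The two "long but straightforward" computations can be checked with exact rational arithmetic. The sketch below assumes that $d$ is the 3x3 determinant with rows $(p_{\phi\psi}, p_{\psi}, p_{\rho} + p_{s_k})$, $(x_{\phi\psi}, x_{\psi}, x_s - 1)$, and $(y_{\phi\psi}, y_{\psi}, y_s - 1)$, and that the conditional probabilities under $\eta$, $\eta_1$, and $\eta_2$ are written as quotients scaled by $(1 - x_s)$ or $(1 - y_s)$; all variable names and sample values are hypothetical.

```python
from fractions import Fraction as F

def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def check_identities(p_pp, p_ps, p_rho, p_sk, x, y):
    """Return whether the two 2x2 determinants equal the claimed multiples of d.

    p_pp, p_ps: probabilities written p_{phi psi} and p_{psi} in the text.
    x, y:       triples (z_{phi psi}, z_{psi}, z_s) for z = x and z = y.
    """
    x_pp, x_ps, x_s = x
    y_pp, y_ps, y_s = y
    d = det3([[p_pp, p_ps, p_rho + p_sk],
              [x_pp, x_ps, x_s - 1],
              [y_pp, y_ps, y_s - 1]])
    # Scheduler eta_1 (all x values), scaled by (1 - x_s).
    n1 = p_pp * (1 - x_s) + (p_rho + p_sk) * x_pp
    d1 = p_ps * (1 - x_s) + (p_rho + p_sk) * x_ps
    # Scheduler eta_2 (all y values), scaled by (1 - y_s).
    n2 = p_pp * (1 - y_s) + (p_rho + p_sk) * y_pp
    d2 = p_ps * (1 - y_s) + (p_rho + p_sk) * y_ps
    # Scheduler eta (x values from rho, y values from K_s), scaled by (1 - y_s).
    n0 = p_pp * (1 - y_s) + p_rho * x_pp * (1 - y_s) + (p_rho * x_s + p_sk) * y_pp
    d0 = p_ps * (1 - y_s) + p_rho * x_ps * (1 - y_s) + (p_rho * x_s + p_sk) * y_ps
    first = n1 * d2 - n2 * d1 == (p_rho + p_sk) * d
    second = n0 * d2 - n2 * d0 == (1 - y_s) * p_rho * d
    return first, second

result = check_identities(F(1, 10), F(2, 10), F(1, 5), F(1, 10),
                          (F(1, 5), F(3, 10), F(1, 4)),
                          (F(1, 10), F(2, 5), F(1, 3)))
```

Because both identities are polynomial, exact fractions avoid any rounding doubt; the sample values only need to be rational numbers.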

Finally, we have all the ingredients needed to prove that there exists a
scheduler close to the supremum which is HI before the stopping condition.

Corollary 2.5.1. [HI before stopping condition] Let $\Pi$ be a MDP, $\phi, \psi \in
\mathrm{Path}$. Then for all $\hat{\epsilon} > 0$, there exists a scheduler $\eta^{*}$ such that $\mathbb{P}^{+}[\phi|\psi] -
\mathbb{P}_{\eta^{*}}[\phi|\psi] < \hat{\epsilon}$ and $\eta^{*}$ is history independent before the stopping condition.

Proof. Follows directly from Lemma 2.5.5 and Lemma 2.5.6. □

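We now proceed with the construction of a maximizing scheduler that is
HI after the stopping condition.

Lemma 2.5.7. [HI after stopping condition] Let $\Pi$ be a MDP, $\phi, \psi \in \mathrm{Path}$,
and $\varphi = \mathrm{StopC}(\phi) \lor \mathrm{StopC}(\psi)$. Then for all schedulers $\eta$ there exists a
scheduler $\eta^{*}$ such that

1) $\eta^{*}$ behaves like $\eta$ before the stopping condition,
2) $\eta^{*}$ is HI after the stopping condition $\varphi$, and
3) $\mathbb{P}_{\eta}[\phi|\psi] \leq \mathbb{P}_{\eta^{*}}[\phi|\psi]$.

Proof. We will prove this result for the case in which $\phi$ is of the form $\Diamond\phi$
and $\psi$ is of the form $\Diamond\psi$; the proof for the remaining cases follows in the
same way.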

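Let us start by introducing some notation. We define, respectively, the
set of paths reaching $\phi$, the set of paths not reaching $\phi$, the set of paths
reaching $\phi$ without reaching $\psi$ before, and the set of paths reaching $\psi \land \neg\phi$
without reaching $\phi$ before as follows

\[
\begin{array}{rcl}
\Delta_{\phi} & = & \{\omega \in \mathrm{Paths}(\Pi) \mid \omega \models \Diamond\phi\},\\
\Delta_{\neg\phi} & = & \{\omega \in \mathrm{Paths}(\Pi) \mid \omega \models \Box\neg\phi\},\\
\Delta_{\phi\psi} & = & \{\omega \in \mathrm{Paths}(\Pi) \mid \omega \models (\neg\phi \land \neg\psi)\,\mathcal{U}\,\phi\},\\
\Delta_{\psi\phi} & = & \{\omega \in \mathrm{Paths}(\Pi) \mid \omega \models (\neg\phi \land \neg\psi)\,\mathcal{U}\,(\psi \land \neg\phi)\},
\end{array}
\]

with $\Delta_{\psi}$ defined analogously to $\Delta_{\phi}$. Note that the last two sets are disjoint.
It is easy to check that

\[
\Delta_{\phi} \cap \Delta_{\psi} = (\Delta_{\phi\psi} \cap \Delta_{\psi}) \cup (\Delta_{\psi\phi} \cap \Delta_{\phi})
\]

and

\[
\Delta_{\psi} = \Delta_{\psi\phi} \cup (\Delta_{\phi\psi} \cap \Delta_{\psi}) = [(\Delta_{\psi\phi} \cap \Delta_{\phi}) \cup (\Delta_{\psi\phi} \cap \Delta_{\neg\phi})] \cup (\Delta_{\phi\psi} \cap \Delta_{\psi}).
\]

Let us now define the minimal set of finite paths "generating" (by their
basic cylinders) $\Delta_{\phi\psi}$ and $\Delta_{\psi\phi}$: $K_{\phi\psi} = \{\sigma \in \mathrm{Paths}^{*}(\Pi) \mid \mathrm{last}(\sigma) \models \phi \land \forall i <
|\sigma| : \sigma_i \models \neg\phi \land \neg\psi\}$ and similarly $K_{\psi\phi} = \{\sigma \in \mathrm{Paths}^{*}(\Pi) \mid \mathrm{last}(\sigma) \models
(\psi \land \neg\phi) \land \forall i < |\sigma| : \sigma_i \models \neg\phi \land \neg\psi\}$. Note that $\Delta_{\phi\psi} = \langle K_{\phi\psi} \rangle$ and
$\Delta_{\psi\phi} = \langle K_{\psi\phi} \rangle$. Thus we can write

\[
\mathbb{P}_{\eta}[\Diamond\phi|\Diamond\psi] = \frac{\mathbb{P}_{\eta}[\langle K_{\phi\psi} \rangle \cap \Delta_{\psi}] + \mathbb{P}_{\eta}[\langle K_{\psi\phi} \rangle \cap \Delta_{\phi}]}{\mathbb{P}_{\eta}[\langle K_{\phi\psi} \rangle \cap \Delta_{\psi}] + \mathbb{P}_{\eta}[\langle K_{\psi\phi} \rangle \cap \Delta_{\phi}] + \mathbb{P}_{\eta}[\langle K_{\psi\phi} \rangle \cap \Delta_{\neg\phi}]}.
\]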

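The construction of the desired scheduler $\eta^{*}$ is in the spirit of the construction
we proposed for the scheduler in Lemma 2.5.5. We let $S_{\phi} = \{s \in S \mid
s \models \phi\}$ and $S_{\psi} = \{s \in S \mid s \models (\psi \land \neg\phi)\}$. Note that $S_{\phi}$ and $S_{\psi}$ are disjoint.
Now we define two families of schedulers $\{\xi_s\}_{s \in S_{\phi}}$ and $\{\zeta_s\}_{s \in S_{\psi}}$ such that:
for all $s_1, s_2 \in S_{\phi}$ we have $\mathbb{P}_{s_1,\xi_{s_1}}[\Diamond\psi] = \mathbb{P}^{+}_{s_1}[\Diamond\psi]$, $\mathbb{P}_{s_2,\xi_{s_2}}[\Diamond\psi] = \mathbb{P}^{+}_{s_2}[\Diamond\psi]$,
and for all $\sigma_1 t \in \mathrm{Paths}^{*}(s_1)$, $\sigma_2 t \in \mathrm{Paths}^{*}(s_2)$ we have $\xi_{s_1}(\sigma_1 t) = \xi_{s_2}(\sigma_2 t)$.
Similarly for $\{\zeta_s\}_{s \in S_{\psi}}$: for all $s_1, s_2 \in S_{\psi}$ we have $\mathbb{P}_{s_1,\zeta_{s_1}}[\Diamond\phi] = \mathbb{P}^{+}_{s_1}[\Diamond\phi]$,
$\mathbb{P}_{s_2,\zeta_{s_2}}[\Diamond\phi] = \mathbb{P}^{+}_{s_2}[\Diamond\phi]$, and for all $\sigma_1 t \in \mathrm{Paths}^{*}(s_1)$, $\sigma_2 t \in \mathrm{Paths}^{*}(s_2)$ we
have $\zeta_{s_1}(\sigma_1 t) = \zeta_{s_2}(\sigma_2 t)$.

We now proceed to define $\eta^{*}$:

\[
\eta^{*}(\sigma) =
\begin{cases}
\xi_{s}(\sigma_{|\alpha|} \cdots \sigma_{|\sigma|}) & \text{if } \alpha \sqsubseteq \sigma \text{ for some } \alpha \in K_{\phi} \text{ such that } \mathrm{last}(\alpha) = s,\\
\zeta_{s}(\sigma_{|\alpha|} \cdots \sigma_{|\sigma|}) & \text{if } \alpha \sqsubseteq \sigma \text{ for some } \alpha \in K_{\psi} \text{ such that } \mathrm{last}(\alpha) = s,\\
\eta(\sigma) & \text{otherwise,}
\end{cases}
\]

where $K_{\phi} = \{\sigma \in \mathrm{Paths}^{*} \mid \mathrm{last}(\sigma) \in S_{\phi}\}$, and similarly $K_{\psi} = \{\sigma \in \mathrm{Paths}^{*} \mid
\mathrm{last}(\sigma) \in S_{\psi}\}$.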

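It is easy to check that $\eta^{*}$ satisfies 1) and 2). As for 3) we first note
that $\mathbb{P}_{\eta}[\langle K_{\phi\psi} \rangle \cap \Delta_{\psi}] \leq \mathbb{P}_{\eta^{*}}[\langle K_{\phi\psi} \rangle \cap \Delta_{\psi}]$, $\mathbb{P}_{\eta}[\langle K_{\psi\phi} \rangle \cap \Delta_{\phi}] \leq \mathbb{P}_{\eta^{*}}[\langle K_{\psi\phi} \rangle \cap \Delta_{\phi}]$,
and $\mathbb{P}_{\eta}[\langle K_{\psi\phi} \rangle \cap \Delta_{\neg\phi}] \geq \mathbb{P}_{\eta^{*}}[\langle K_{\psi\phi} \rangle \cap \Delta_{\neg\phi}]$.

In addition, we need the following simple remark.

Remark 2.5.8. Let $f : \mathbb{R} \to \mathbb{R}$ be a function defined as $f(x) = \frac{a + x}{b + x}$ where
$a$ and $b$ are constants in the interval $[0, 1]$ such that $b \geq a$. Then $f$ is
increasing.

Finally, we have

\[
\begin{array}{rcl}
\mathbb{P}_{\eta}[\Diamond\phi|\Diamond\psi] & = & \dfrac{\mathbb{P}_{\eta}[\langle K_{\phi\psi} \rangle \cap \Delta_{\psi}] + \mathbb{P}_{\eta}[\langle K_{\psi\phi} \rangle \cap \Delta_{\phi}]}{\mathbb{P}_{\eta}[\langle K_{\phi\psi} \rangle \cap \Delta_{\psi}] + \mathbb{P}_{\eta}[\langle K_{\psi\phi} \rangle \cap \Delta_{\phi}] + \mathbb{P}_{\eta}[\langle K_{\psi\phi} \rangle \cap \Delta_{\neg\phi}]}\\[10pt]
& \leq & \{\text{by Remark 2.5.8}\}\\[4pt]
& & \dfrac{\mathbb{P}_{\eta^{*}}[\langle K_{\phi\psi} \rangle \cap \Delta_{\psi}] + \mathbb{P}_{\eta}[\langle K_{\psi\phi} \rangle \cap \Delta_{\phi}]}{\mathbb{P}_{\eta^{*}}[\langle K_{\phi\psi} \rangle \cap \Delta_{\psi}] + \mathbb{P}_{\eta}[\langle K_{\psi\phi} \rangle \cap \Delta_{\phi}] + \mathbb{P}_{\eta}[\langle K_{\psi\phi} \rangle \cap \Delta_{\neg\phi}]}\\[10pt]
& \leq & \{\text{by Remark 2.5.8}\}\\[4pt]
& & \dfrac{\mathbb{P}_{\eta^{*}}[\langle K_{\phi\psi} \rangle \cap \Delta_{\psi}] + \mathbb{P}_{\eta^{*}}[\langle K_{\psi\phi} \rangle \cap \Delta_{\phi}]}{\mathbb{P}_{\eta^{*}}[\langle K_{\phi\psi} \rangle \cap \Delta_{\psi}] + \mathbb{P}_{\eta^{*}}[\langle K_{\psi\phi} \rangle \cap \Delta_{\phi}] + \mathbb{P}_{\eta}[\langle K_{\psi\phi} \rangle \cap \Delta_{\neg\phi}]}\\[10pt]
& \leq & \dfrac{\mathbb{P}_{\eta^{*}}[\langle K_{\phi\psi} \rangle \cap \Delta_{\psi}] + \mathbb{P}_{\eta^{*}}[\langle K_{\psi\phi} \rangle \cap \Delta_{\phi}]}{\mathbb{P}_{\eta^{*}}[\langle K_{\phi\psi} \rangle \cap \Delta_{\psi}] + \mathbb{P}_{\eta^{*}}[\langle K_{\psi\phi} \rangle \cap \Delta_{\phi}] + \mathbb{P}_{\eta^{*}}[\langle K_{\psi\phi} \rangle \cap \Delta_{\neg\phi}]}\\[10pt]
& = & \mathbb{P}_{\eta^{*}}[\Diamond\phi|\Diamond\psi]. \qquad \Box
\end{array}
\]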

Proof of Theorem 2.5.3. It follows straightforwardly from Corollary 2.5.1
and Lemma 2.5.7. □

2.5.2 Deterministic Schedulers 

We now proceed to show that deterministic schedulers suffice to attain 
optimal conditional probabilities. 

The following result states that taking the convex combination of schedulers
does not increase the conditional probability $\mathbb{P}[\phi|\psi]$.

Lemma 2.5.9. Let $\Pi$ be a MDP, $s$ a state, and $\phi$, $\psi$ path formulas. Suppose
that the $s$-scheduler $\eta$ is a convex combination of $\eta_1$ and $\eta_2$. Then

\[
\mathbb{P}_{s,\eta}[\phi|\psi] \leq \max(\mathbb{P}_{s,\eta_1}[\phi|\psi], \mathbb{P}_{s,\eta_2}[\phi|\psi]).
\]

Proof. To prove this lemma we need to use the following technical result:
The function $f : \mathbb{R} \to \mathbb{R}$ defined as below is monotonous.

\[
f(x) \triangleq \frac{x v_1 + (1 - x) v_2}{x w_1 + (1 - x) w_2}
\]

where $v_1, v_2 \in [0, \infty)$ and $w_1, w_2 \in (0, \infty)$. This claim follows from the fact
that $f'(x) = \frac{v_1 w_2 - v_2 w_1}{(x w_1 + (1 - x) w_2)^{2}}$ is always $\geq 0$ or always $\leq 0$.

Now, by applying the result above to

\[
[0, 1] \ni \alpha \mapsto \frac{\alpha\,\mathbb{P}_{s,\eta_1}[\phi \land \psi] + (1 - \alpha)\,\mathbb{P}_{s,\eta_2}[\phi \land \psi]}{\alpha\,\mathbb{P}_{s,\eta_1}[\psi] + (1 - \alpha)\,\mathbb{P}_{s,\eta_2}[\psi]}
\]

we get that the maximum is reached at $\alpha = 0$ or $\alpha = 1$. Because $\eta$ is a
convex combination of $\eta_1$ and $\eta_2$, $\mathbb{P}_{s,\eta}[\phi|\psi] \leq \mathbb{P}_{s,\eta_2}[\phi|\psi]$ (in the first case)
or $\mathbb{P}_{s,\eta}[\phi|\psi] \leq \mathbb{P}_{s,\eta_1}[\phi|\psi]$ (in the second case). □

Lemma 2.5.10. Let $\Pi$ be a MDP, $s$ a state, and $\varphi$ a path formula. Then
every $\varphi$-sHI $s$-scheduler on $\Pi$ is a convex combination of deterministic $\varphi$-sHI
$s$-schedulers.

Proof. The result follows from the fact that sHI schedulers have only finitely
many choices to make at each state (at most two) and every choice at a
particular state (either before or after the stopping condition) is a convex
combination of deterministic choices at that state (either before or after
the stopping condition). □
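The monotonicity argument in the proof of Lemma 2.5.9 can be illustrated numerically: the quotient of two convex combinations attains its maximum at an endpoint of $[0, 1]$. The function names and the sample values below are assumptions chosen only for this sketch.

```python
def ratio(x, v1, v2, w1, w2):
    """f(x) = (x*v1 + (1-x)*v2) / (x*w1 + (1-x)*w2) with w1, w2 > 0."""
    return (x * v1 + (1 - x) * v2) / (x * w1 + (1 - x) * w2)

def is_monotone(values, tol=1e-12):
    """True if the sequence is non-decreasing or non-increasing (up to tol)."""
    pairs = list(zip(values, values[1:]))
    non_decreasing = all(a <= b + tol for a, b in pairs)
    non_increasing = all(a + tol >= b for a, b in pairs)
    return non_decreasing or non_increasing

# v_i plays the role of P[phi and psi] and w_i of P[psi] under scheduler eta_i.
grid = [i / 100 for i in range(101)]
values = [ratio(x, 0.2, 0.3, 0.5, 0.4) for x in grid]
best = max(values)
```

With these values the quotient decreases from 0.75 at x = 0 to 0.4 at x = 1, so no proper convex combination beats the better of the two schedulers.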

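Finally, combining Theorem 2.5.3 and the previous lemma we obtain:

Theorem 2.5.11. Let $\Pi$ be a MDP, $\phi, \psi \in \mathrm{Path}$, and $\varphi = \mathrm{StopC}(\phi) \lor
\mathrm{StopC}(\psi)$. Then we have

\[
\mathbb{P}^{+}[\phi|\psi] = \sup_{\eta \in \mathrm{Sch}^{\varphi}_{d}(\Pi)} \mathbb{P}_{\eta}[\phi|\psi],
\]

where $\mathrm{Sch}^{\varphi}_{d}(\Pi)$ is the set of deterministic and $\varphi$-sHI schedulers of $\Pi$.

Since the number of deterministic and semi HI schedulers is finite we
know that there exists a scheduler attaining the optimal conditional probability,
i.e. $\sup_{\eta \in \mathrm{Sch}^{\varphi}_{d}(\Pi)} \mathbb{P}_{\eta}[\phi|\psi] = \max_{\eta \in \mathrm{Sch}^{\varphi}_{d}(\Pi)} \mathbb{P}_{\eta}[\phi|\psi]$. Note that this
implies that cpCTL is decidable.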

We conclude this section showing that there exists a deterministic and
semi HI scheduler maximizing the conditional probabilities of Example
2.5.1.

Example 2.5.12. Consider the MDP and cpCTL formula of Example 2.5.1.
According to Theorem 2.5.11 there exists a deterministic and $(B \lor P)$-sHI
scheduler that maximizes $\mathbb{P}_{s_0,\eta}[\Diamond B|\Diamond P]$. In this case, a maximizing scheduler
will always take the same decision ($\pi_3$) before the system reaches $s_3$ (a
state satisfying the until stopping condition $(B \lor P)$) and always the same
decision ($\pi_1$) after the system reaches $s_3$.

2.6 Model Checking cpCTL

Model checking cpCTL means checking if a state $s$ satisfies a certain state
formula $\phi$. We focus on formulas of the form $\mathbb{P}_{\leq a}[\phi|\psi]$ and show how to
compute $\mathbb{P}^{+}[\phi|\psi]$. The case $\mathbb{P}_{\geq a}[\phi|\psi]$ is similar.

Recall that model checking pCTL is based on the Bellman-equations.
For instance, $\mathbb{P}^{+}_{s}[\Diamond B] = \max_{\pi \in \tau(s)} \sum_{t \in \mathrm{succ}(s)} \pi(t) \cdot \mathbb{P}^{+}_{t}[\Diamond B]$ whenever $s \not\models B$.
So a scheduler $\eta$ that maximizes $\mathbb{P}_{s}[\Diamond B]$ chooses $\pi \in \tau(s)$ maximizing
$\sum_{t \in \mathrm{succ}(s)} \pi(t) \cdot \mathbb{P}^{+}_{t}[\Diamond B]$. In a successor state $t$, $\eta$ still behaves as a scheduler
that maximizes $\mathbb{P}_{t}[\Diamond B]$. As shown below, such a local Bellman-equation
is not true for conditional probabilities: a scheduler that maximizes a
conditional probability such as $\mathbb{P}_{s}[\Diamond B|\Box P]$ does not necessarily maximize
$\mathbb{P}_{t}[\Diamond B|\Box P]$ for successors $t$ of $s$.
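For plain (unconditional) reachability, the Bellman-equation above can be turned into a simple value iteration. The sketch below is only an illustration under assumed names: an MDP is a dictionary mapping each state to a list of distributions, each distribution is a dictionary from successor states to probabilities, and the fixed number of iterations is an arbitrary choice rather than a convergence criterion.

```python
def max_reachability(mdp, targets, iterations=1000):
    """Approximate the maximal probability of reaching `targets` from each state.

    mdp:     dict state -> list of distributions (dict successor -> probability).
    targets: set of states satisfying B.
    """
    value = {s: 1.0 if s in targets else 0.0 for s in mdp}
    for _ in range(iterations):
        # One Bellman update: best distribution with respect to the old values.
        value = {
            s: 1.0 if s in targets else
               max(sum(p * value[t] for t, p in pi.items()) for pi in mdp[s])
            for s in mdp
        }
    return value

# Hypothetical three-state MDP: from s0 one distribution reaches b with
# probability 1/2, the other retries until b is reached.
example = {
    "s0": [{"b": 0.5, "d": 0.5}, {"b": 0.25, "s0": 0.75}],
    "b": [{"b": 1.0}],
    "d": [{"d": 1.0}],
}
values = max_reachability(example, {"b"})
```

In this example the locally best choice in `s0` is the retrying distribution, and the same choice stays optimal from every successor; this locality is what fails for conditional probabilities.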

Example 2.6.1. Consider the MDP and cpCTL formula P<^[OS|nP] 
of Figure 2.1. There are only two deterministic schedulers. The first 
one, 7/1, chooses 7r2 when the system reaches the state S2 and the sec- 
ond one, r/2, chooses vrs when the system reaches S2- For the first one 
W^^^^^^[OB\UP] = 1 - ^, and for the second one ¥ ,^^^^^[OB\UP] = §. So 
P+ [O^pi^] = max(l - ^, §). Therefore, if a > ^ the scheduler that 
maximizes P^JO^pi^] is m (P,,, ^JO^pP] = P+ [O^pi^]) and otherwise 
it is 771 (P,„,^JOBpP] =P+[Oi?'pP]). 

Furthermore, P+ [OSpP] = 1 and P+ [0-BpP] = 1 - 2a; the scheduler 
that obtains this last maximum is the one that chooses 7r2 in S2- 

Thus, if a > ^ the scheduler that maximizes the conditional probability 
from So is taking a different decision than the one that maximize the condi- 
tional probability from S2- Furthermore, max(l — |f ) = P^[C>-BpP] ^ 
fP+ [OSpi^] + iP+ [0-BpP] = 1 - f for all a G (0, 1], showing that the 
Bellman-equation from above does not generalize to cpCTL. 

As a consequence of this observation, it is not possible to "locally maximize"
cpCTL properties (i.e. to obtain the global maximum $\mathbb{P}^{+}_{s_0}[\phi|\psi]$ by
maximizing $\mathbb{P}_{t}[\phi|\psi]$ for all states $t$). This has a significant impact in terms
of model-checking complexity: as we will show in the rest of this section,
to verify a cpCTL property it is necessary to compute and keep track of
several conditional probabilities and the desired maximum value can only
be obtained after all these probabilities have been collected.

2.6.1 Model Checking $\mathbb{P}_{\leq a}[\phi|\psi]$

An obvious way to compute $\mathbb{P}^{+}_{s}[\phi|\psi]$ is by computing the pairs $(\mathbb{P}_{s,\eta}[\phi \land \psi],
\mathbb{P}_{s,\eta}[\psi])$ for all deterministic sHI schedulers $\eta$, and then taking the maximum
quotient $\mathbb{P}_{s,\eta}[\phi \land \psi] / \mathbb{P}_{s,\eta}[\psi]$. This follows from the fact that there
exist finitely many deterministic semi history-independent schedulers and
that one of them attains the maximal conditional probability; however, the
number of such schedulers grows exponentially in the size of the MDP
so computing these pairs for all of them is computationally expensive.
Our plan is to first present the necessary techniques to naively compute
$(\mathbb{P}_{s,\eta}[\phi \land \psi], \mathbb{P}_{s,\eta}[\psi])$ for all deterministic sHI schedulers $\eta$ and then present
an algorithm that allows model checking $\mathbb{P}_{\leq a}[\phi|\psi]$ without collecting such
pairs for all sHI schedulers.

1) A naive approach to compute $\mathbb{P}^{+}[\phi|\psi]$

The algorithm is going to keep track of a list of pairs of probabilities of the
form $(\mathbb{P}_{t,\eta}[\phi \land \psi], \mathbb{P}_{t,\eta}[\psi])$ for all states $t$ and $\eta$ a deterministic sHI scheduler.
We start by defining a data structure to keep track of these pairs of
probabilities.

Definition 2.6.2. Let $L$ be the set of expressions of the form $(p_1, q_1) \lor \cdots \lor
(p_n, q_n)$ where $p_i, q_i \in [0, \infty)$ and $q_i \geq p_i$, for all $n \in \mathbb{N}^{*}$. On $L$ we consider
the smallest congruence relation $\equiv_1$ satisfying idempotence, commutativity,
and associativity, i.e.:

\[
\begin{array}{rcl}
(p_1, q_1) \lor (p_1, q_1) & \equiv_1 & (p_1, q_1)\\
(p_1, q_1) \lor (p_2, q_2) & \equiv_1 & (p_2, q_2) \lor (p_1, q_1)\\
((p_1, q_1) \lor (p_2, q_2)) \lor (p_3, q_3) & \equiv_1 & (p_1, q_1) \lor ((p_2, q_2) \lor (p_3, q_3))
\end{array}
\]

Note that $(p_1, q_1) \lor \cdots \lor (p_n, q_n) \equiv_1 (p'_1, q'_1) \lor \cdots \lor (p'_{n'}, q'_{n'})$ if and only if
$\{(p_1, q_1), \ldots, (p_n, q_n)\} = \{(p'_1, q'_1), \ldots, (p'_{n'}, q'_{n'})\}$.

We let $L_1$ be the set of equivalence classes and denote the projection
map $L \to L_1$ that maps each expression to its equivalence class by $f_1$. On
$L$ we also define the maximum quotient $\top : L \to [0, \infty)$ by

\[
\top\left(\bigvee_{i=1}^{n} (p_i, q_i)\right) \triangleq \max\left(\left\{\frac{p_i}{q_i} \,\middle|\, q_i \neq 0,\ i = 1, \ldots, n\right\} \cup \{0\}\right).
\]

Note that $\top$ induces a map $\top_1 : L_1 \to [0, \infty)$ making the diagram in
Figure 2.6.1 (a) commute, i.e., such that $\top_1 \circ f_1 = \top$.
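The data structure of Definition 2.6.2 and the maximum quotient can be sketched directly: because the congruence identifies expressions with the same set of pairs, an equivalence class can be represented by a `frozenset`. The names `canonical` and `max_quotient` are assumptions introduced for this illustration.

```python
def canonical(pairs):
    """Representative of an equivalence class: the set of pairs of the
    expression (order and repetitions of the disjuncts do not matter)."""
    return frozenset(pairs)

def max_quotient(pairs):
    """Maximum of p/q over the pairs with q != 0, and 0 if there is none."""
    return max([p / q for p, q in pairs if q != 0] + [0.0])

# Two expressions with the same disjuncts in a different order and with a
# repetition; they denote the same class and have the same maximum quotient.
e1 = [(1, 1), (0, 1), (0, 0)]
e2 = [(0, 0), (1, 1), (0, 1), (1, 1)]
same_class = canonical(e1) == canonical(e2)
```

Since `max_quotient` only depends on the set of pairs, it gives the same value on every expression of a class, which is the commutation property stated above.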

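Definition 2.6.3. Let $\Pi$ be a MDP. We define the function $\delta : S \times
\mathrm{Stat} \times \mathrm{Path} \times \mathrm{Path} \to L$ by

\[
\delta(s, \varphi, \phi, \psi) \triangleq \bigvee_{\eta \in \mathrm{Sch}^{\varphi}_{d}(\Pi)} (\mathbb{P}_{s,\eta}[\phi \land \psi], \mathbb{P}_{s,\eta}[\psi])
\]

and we define $\delta_1 : S \times \mathrm{Stat} \times \mathrm{Path} \times \mathrm{Path} \to L_1$ by $\delta_1 = f_1 \circ \delta$.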




Figure 2.6.1: (a) Commutative diagram. (b) $\delta$-values.



When no confusion arises, we omit the subscripts 1 and omit the projection
map $f_1$, writing $(p_1, q_1) \lor \cdots \lor (p_n, q_n)$ for the equivalence class it generates.

Example 2.6.4. In Figure 2.6.1 we show the value $\delta(s, B \lor \neg P, \Diamond B, \Box P)$
associated to each state $s$ of the MDP in Figure 2.1.



The following lemma states that it is possible to obtain maximum conditional
probabilities using $\delta$.

Lemma 2.6.5. Given $\Pi = (S, s_0, L, \tau)$ an acyclic MDP, and $\phi_1, \phi_2, \psi_1, \psi_2 \in
\mathrm{Stat}$. Then

\[
\mathbb{P}^{+}_{s}[\phi_1\,\mathcal{U}\,\phi_2 \mid \psi_1\,\mathcal{U}\,\psi_2] = \top(\delta_s(\phi_1\,\mathcal{U}\,\phi_2 \mid \psi_1\,\mathcal{U}\,\psi_2))
\]

and

\[
\mathbb{P}^{+}_{s}[\phi_1\,\mathcal{U}\,\phi_2 \mid \Box\psi_1] = \top(\delta_s(\phi_1\,\mathcal{U}\,\phi_2 \mid \Box\psi_1)),
\]

where $\delta_s(\phi_1\,\mathcal{U}\,\phi_2 \mid \psi_1\,\mathcal{U}\,\psi_2) = \delta(s, \mathrm{StopC}(\phi_1\,\mathcal{U}\,\phi_2) \lor \mathrm{StopC}(\psi_1\,\mathcal{U}\,\psi_2), \phi_1\,\mathcal{U}\,\phi_2,
\psi_1\,\mathcal{U}\,\psi_2)$ and $\delta_s(\phi_1\,\mathcal{U}\,\phi_2 \mid \Box\psi_1) = \delta(s, \mathrm{StopC}(\phi_1\,\mathcal{U}\,\phi_2) \lor \mathrm{StopC}(\Box\psi_1), \phi_1\,\mathcal{U}\,\phi_2,
\Box\psi_1)$.

Proof. The lemma follows straightforwardly from the definitions of $\delta$ and
$\top$ and the fact that the maximum conditional probability is indeed reached
by a deterministic sHI scheduler. □

Remember that there are finitely many sHI schedulers. Thus, $\delta$ (and therefore
$\mathbb{P}^{+}[-|-]$) can in principle be computed by explicitly listing them all.
However, this is of course an inefficient way to compute maximum conditional
probabilities.

We now show how to compute $\mathbb{P}^{+}[-|-]$ in a more efficient way. We will
first provide an algorithm to compute maximum conditional probabilities
for acyclic MDPs. We then show how to apply this algorithm to MDPs
with cycles by means of a technique, based on SCC analysis, that allows the
transformation of an MDP with cycles to an equivalent acyclic MDP.

2) An algorithm to compute $\mathbb{P}^{+}[\phi|\psi]$ for Acyclic MDPs

We will now present a recursive algorithm to compute $\mathbb{P}^{+}[\phi|\psi]$ for acyclic
MDPs using a variant of $\delta$ (changing its image). As we mentioned before,
to compute maximum conditional probabilities it is not necessary to
consider all the pairs $(\mathbb{P}_{\eta}[\phi \land \psi], \mathbb{P}_{\eta}[\psi])$ (with $\eta$ a deterministic and semi
HI scheduler). In particular, we will show that it is sufficient to consider
only deterministic and semi HI schedulers (see definition of $D$ below) that
behave as an optimizing scheduler (i.e. either maximizing or minimizing
a pCTL formula $\phi$) after reaching the stopping condition (i.e. a state $s$
satisfying $\mathrm{StopC}(\varphi)$).

We plan to compute a function $\hat{\delta}(-) \subseteq \delta(-)$ such that $\top(\hat{\delta}) = \top(\delta)$.
Intuitively, $\hat{\delta}(-)$ can be thought of as

\[
\hat{\delta}(s, \varphi, \phi, \psi) = \bigvee_{\eta \in D} (\mathbb{P}_{s,\eta}[\phi \land \psi], \mathbb{P}_{s,\eta}[\psi]),
\]

where $D$ contains all deterministic and semi HI schedulers $\eta$ such that $\eta$
optimizes $\mathbb{P}_{s,\eta}[\phi]$ for some $s \models \mathrm{StopC}(\varphi)$ and $\phi \in$ pCTL formula.

This intuition will become evident when we present our recursive algorithm
to compute conditional probabilities (see Theorem 2.6.11 below).
The states $s$ involved in the definition of $D$ correspond to the base case
of the algorithm and the formula $\phi$ corresponds to the formula that the
algorithm maximizes/minimizes when such an $s$ is reached.

We will present algorithms to recursively (in $s$) compute $\hat{\delta}^{\mathcal{U}}_{s}$ and $\hat{\delta}^{\Box}_{s}$ in
acyclic MDPs. The base cases of the recursion are the states where the
stopping condition holds. In the recursive case we can express $\hat{\delta}^{\mathcal{U}}_{s}$ (respectively
$\hat{\delta}^{\Box}_{s}$) in terms of the $\hat{\delta}^{\mathcal{U}}_{t}$ (respectively $\hat{\delta}^{\Box}_{t}$) of the successor states $t$ of
$s$.

We start by formalizing the notion of acyclic MDP. We call a MDP 
acyclic if it contains no cycles other than the trivial ones (i.e., other than 
selfloops associated to absorbing states). 

Definition 2.6.6. A MDP $\Pi$ is called acyclic if for all states $s \in S$ and all $\pi \in \tau(s)$ we have $\pi(s) = 0$ or $\pi(s) = 1$, and, furthermore, for all paths $\omega$, if there exist $i, j$ such that $i < j$ and $\omega_i = \omega_j$, then we have $\omega_i = \omega_k$ for all $k > i$.

In addition, in order to formally define $\hat{\delta}$ we define a new congruence $\equiv_2$.

Definition 2.6.7. Consider the set of expressions $L$ defined in Definition 2.6.2. On $L$ we now consider the smallest congruence relation $\equiv_2$ containing $\equiv_1$ and satisfying

(1) $(p_1, q_1) \lor (p_1, q_2) \equiv_2 (p_1, \min(q_1, q_2))$, and

(2) $(p_1, q_1) \lor (p_2, q_1) \equiv_2 (\max(p_1, p_2), q_1)$, and

(3) $(p_1 + a, q_1 + a) \lor (p_1, q_1) \equiv_2 (p_1 + a, q_1 + a)$,

where $a \in [0, \infty)$. We write $L_2$ for the set of equivalence classes and denote the projection map $L \to L_2$ by $f_2$.



Since $\equiv_1 \subseteq \equiv_2$, this projection map factors through $f_1$, say $g : L_1 \to L_2$ is the unique map such that $g \circ f_1 = f_2$.



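Definition 2.6.8. We define $\hat{\delta} : S \times \mathrm{Stat} \times \mathrm{Path} \times \mathrm{Path} \to L_2$ by $\hat{\delta} = f_2 \circ \delta$.

Now, in order to prove that $\top(\hat{\delta}) = \top(\delta)$ we need to define a scalar multiplication operator $\odot$ and an addition operator $\oplus$ on $L$.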

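Definition 2.6.9. We define $\odot : [0, \infty) \times L \to L$ and $\oplus : L \times L \to L$ by

$$c \odot \bigvee_{i=1}^{n} (p_i, q_i) = \bigvee_{i=1}^{n} (c \cdot p_i,\, c \cdot q_i)$$

and

$$\bigvee_{i=1}^{n} (p_i, q_i) \oplus \bigvee_{j=1}^{m} (p'_j, q'_j) = \bigvee_{i=1}^{n} \bigvee_{j=1}^{m} (p_i + p'_j,\, q_i + q'_j).$$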
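The two operators can be illustrated with a minimal sketch. It assumes that an expression of $L$ modulo idempotence, commutativity and associativity is represented as a finite set of pairs; the function names `scale` and `add` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def scale(c, pairs):
    """c (.) V(p_i, q_i): multiply both components of every pair by c."""
    return frozenset((c * p, c * q) for p, q in pairs)

def add(left, right):
    """V(p_i, q_i) (+) V(p'_j, q'_j): all component-wise sums of one pair from each side."""
    return frozenset((p + p2, q + q2) for p, q in left for p2, q2 in right)

half = Fraction(1, 2)
# Two hypothetical successor expressions, each a disjunction of (P[phi and psi], P[psi]) pairs.
a = frozenset({(Fraction(1, 4), Fraction(1, 2)), (Fraction(0), Fraction(1))})
b = frozenset({(Fraction(1, 2), Fraction(1, 2))})

# A distribution that moves to each successor with probability 1/2.
combined = add(scale(half, a), scale(half, b))
```

Here `combined` contains one pair for every way of choosing a pair from each successor, which is why the number of pairs can grow quickly unless some of them are discarded.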

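Note that $\odot$ and $\oplus$ induce maps $\odot_1 : [0, \infty) \times L_1 \to L_1$ and $\oplus_1 : L_1 \times L_1 \to L_1$ as shown in Figure 2.6 below. As before, we omit the subscript 1 if that will not cause confusion.

[Figure 2.6 shows two commutative squares: $\odot$ on $[0, \infty) \times L$ is related to $\odot_1$ on $[0, \infty) \times L_1$ via $\mathrm{id} \times f_1$ and $f_1$, and $\oplus$ on $L \times L$ is related to $\oplus_1$ on $L_1 \times L_1$ via $f_1 \times f_1$ and $f_1$.]

Figure 2.6: Commutative diagrams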

The following seemingly innocent lemma is readily proven, but it contains the key to allow us to discard certain pairs of probabilities. The fact that $\top$ induces operations on $L_2$ means that it is correct to "simplify" expressions using $\equiv_2$ when we are interested in the maximum or minimum quotient.

The intuition is as follows. Normally, which decision is best in a certain state (or rather, at a certain finite path) to optimize the conditional probability might depend on probabilities or choices in a totally different part of the automaton (see Example 2.6.1). Sometimes, however, it is possible to decide locally what decision the scheduler should take. The congruence $\equiv_2$ encodes three such cases, each of them corresponding to one clause in Definition 2.6.7. (1) If from a state $t$ the scheduler $\eta$ can either take a transition after which $P_\eta[\phi \land \psi] = p_1$ and $P_\eta[\psi] = q_1$ or a transition after which $P_\eta[\phi \land \psi] = p_1$ and $P_\eta[\psi] = q_2$, then in order to maximize the conditional probability it is always best to take the decision where $P_\eta[\psi] = \min(q_1, q_2)$. (2) Similarly, if the scheduler can either take a transition after which $P_\eta[\phi \land \psi] = p_1$ and $P_\eta[\psi] = q_1$ or one after which $P_\eta[\phi \land \psi] = p_2$ and $P_\eta[\psi] = q_1$, then it is always best to take the decision where $P_\eta[\phi \land \psi] = \max(p_1, p_2)$. (3) Finally, if $\eta$ has the option to either take a transition after which $P_\eta[\phi \land \psi] = p_1 + a$ and $P_\eta[\psi] = q_1 + a$ or one after which $P_\eta[\phi \land \psi] = p_1$ and $P_\eta[\psi] = q_1$, for some $a > 0$, then a maximizing scheduler should always take the first of these two options.
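The three local rules can be sketched as a pruning step on a finite set of pairs, followed by the maximum quotient. This is only an illustration under stated assumptions: expressions are represented as sets of `(p, q)` pairs with `p <= q`, and the names `dominated`, `prune` and `top` are hypothetical.

```python
from fractions import Fraction

def dominated(pair, other):
    """True if `other` makes `pair` redundant by clause (1), (2) or (3)."""
    if pair == other:
        return False
    p, q = pair
    p2, q2 = other
    if p == p2 and q2 <= q:        # clause (1): same p, keep the smaller q
        return True
    if q == q2 and p2 >= p:        # clause (2): same q, keep the larger p
        return True
    shift = p2 - p                 # clause (3): other = (p + a, q + a) with a > 0
    return shift > 0 and q2 - q == shift

def prune(pairs):
    """Drop every pair that some other pair in the set makes redundant."""
    return {pair for pair in pairs if not any(dominated(pair, o) for o in pairs)}

def top(pairs):
    """Maximum quotient p/q over pairs with q != 0 (0 if there is none)."""
    return max([p / q for p, q in pairs if q != 0] + [Fraction(0)])

F = Fraction
pairs = {(F(1, 4), F(1, 2)), (F(1, 4), F(3, 4)), (F(1, 8), F(1, 2)),
         (F(1, 2), F(3, 4)), (F(3, 4), F(1)), (F(1, 10), F(1, 5))}
kept = prune(pairs)
```

In this hypothetical instance four of the six pairs are discarded, and the maximum quotient of the remaining pairs equals that of the full set, which is the property the lemma below makes precise for $\top$.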

Lemma 2.6.10. The operators $\odot$, $\oplus$, and $\top$ on $L$ induce operators $\odot_2$, $\oplus_2$, and $\top_2$ on $L_2$.

Proof. The idempotence, commutativity and associativity cases are trivial; we only treat the other three cases.

($\odot$) For (1) we have

$$\begin{aligned}
c \odot \big((p, q) \lor (p, q')\big) &= (c \cdot p, c \cdot q) \lor (c \cdot p, c \cdot q') \\
&\equiv_2 (c \cdot p, \min(c \cdot q, c \cdot q')) \\
&= (c \cdot p, c \cdot \min(q, q')) \\
&= c \odot (p, \min(q, q')).
\end{aligned}$$

Additionally, note that since $q \geq p$ and $q' \geq p$ we have $\min(q, q') \geq p$.

For (2) the proof goes like in (1). For (3) we have the following

$$\begin{aligned}
c \odot \big((p + a, q + a) \lor (p, q)\big) &= (c \cdot p + c \cdot a, c \cdot q + c \cdot a) \lor (c \cdot p, c \cdot q) \\
&\equiv_2 (c \cdot p + c \cdot a, c \cdot q + c \cdot a) \\
&= c \odot (p + a, q + a).
\end{aligned}$$



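($\oplus$) For (1) we have

$$\begin{aligned}
\big((p, q) \lor (p, q')\big) \oplus \bigvee_{i=1}^{n} (p_i, q_i)
&= \bigvee_{i=1}^{n} \big((p + p_i, q + q_i) \lor (p + p_i, q' + q_i)\big) \\
&\equiv_2 \bigvee_{i=1}^{n} (p + p_i, \min(q + q_i, q' + q_i)) \\
&= \bigvee_{i=1}^{n} (p + p_i, \min(q, q') + q_i) \\
&= (p, \min(q, q')) \oplus \bigvee_{i=1}^{n} (p_i, q_i).
\end{aligned}$$

For (2) the proof goes like in (1). For (3) we have the following

$$\begin{aligned}
\big((p + a, q + a) \lor (p, q)\big) \oplus \bigvee_{i=1}^{n} (p_i, q_i)
&= \bigvee_{i=1}^{n} (p + a + p_i, q + a + q_i) \lor \bigvee_{i=1}^{n} (p + p_i, q + q_i) \\
&= \bigvee_{i=1}^{n} \big((p + a + p_i, q + a + q_i) \lor (p + p_i, q + q_i)\big) \\
&\equiv_2 \bigvee_{i=1}^{n} (p + a + p_i, q + a + q_i) \\
&= (p + a, q + a) \oplus \bigvee_{i=1}^{n} (p_i, q_i).
\end{aligned}$$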



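($\top$) For (1) we will start by assuming that $q, q' \neq 0$. Then

$$\begin{aligned}
\top\Big((p, q) \lor (p, q') \lor \bigvee_{i=1}^{n} (p_i, q_i)\Big)
&= \max\Big(\Big\{\tfrac{p}{q}, \tfrac{p}{q'}\Big\} \cup \Big\{\tfrac{p_i}{q_i} \;\Big|\; 1 \leq i \leq n,\ q_i \neq 0\Big\} \cup \{0\}\Big) \\
&= \max\Big(\Big\{\tfrac{p}{\min(q, q')}\Big\} \cup \Big\{\tfrac{p_i}{q_i} \;\Big|\; 1 \leq i \leq n,\ q_i \neq 0\Big\} \cup \{0\}\Big) \\
&= \top\Big((p, \min(q, q')) \lor \bigvee_{i=1}^{n} (p_i, q_i)\Big).
\end{aligned}$$

Now assume that $q = 0$, $q' \neq 0$ (the case $q \neq 0$, $q' = 0$ is similar). Note that we now have that $p = 0$. Then

$$\begin{aligned}
\top\Big((p, q) \lor (p, q') \lor \bigvee_{i=1}^{n} (p_i, q_i)\Big)
&= \max\Big(\Big\{\tfrac{p}{q'}\Big\} \cup \Big\{\tfrac{p_i}{q_i} \;\Big|\; 1 \leq i \leq n,\ q_i \neq 0\Big\} \cup \{0\}\Big) \\
&= \max\Big(\Big\{\tfrac{p_i}{q_i} \;\Big|\; 1 \leq i \leq n,\ q_i \neq 0\Big\} \cup \{0\}\Big) \\
&= \top\Big((p, 0) \lor \bigvee_{i=1}^{n} (p_i, q_i)\Big) \\
&= \top\Big((p, \min(q, q')) \lor \bigvee_{i=1}^{n} (p_i, q_i)\Big).
\end{aligned}$$

Finally, assume that $q = q' = 0$, then also $p = 0$, so

$$\begin{aligned}
\top\Big((p, q) \lor (p, q') \lor \bigvee_{i=1}^{n} (p_i, q_i)\Big)
&= \max\Big(\Big\{\tfrac{p_i}{q_i} \;\Big|\; 1 \leq i \leq n,\ q_i \neq 0\Big\} \cup \{0\}\Big) \\
&= \top\Big((p, 0) \lor \bigvee_{i=1}^{n} (p_i, q_i)\Big) \\
&= \top\Big((p, \min(q, q')) \lor \bigvee_{i=1}^{n} (p_i, q_i)\Big).
\end{aligned}$$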

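For (2) the proof goes like in (1). For (3) we first need the following.

Let $f : \mathbb{R} \to \mathbb{R}$ be a function defined as $f(x) = \frac{a + x}{b + x}$, where $a$ and $b$ are constants in the interval $(0, 1]$ with $a \leq b$. Then $f$ is increasing on $[0, \infty)$. Let us now assume that $q \neq 0$ or $a \neq 0$. Then

$$\begin{aligned}
\top\Big(\big((p + a, q + a) \lor (p, q)\big) \lor \bigvee_{i=1}^{n} (p_i, q_i)\Big)
&= \max\Big(\Big\{\tfrac{p + a}{q + a}\Big\} \cup \Big\{\tfrac{p}{q} \;\Big|\; q \neq 0\Big\} \cup \Big\{\tfrac{p_i}{q_i} \;\Big|\; 1 \leq i \leq n,\ q_i \neq 0\Big\} \cup \{0\}\Big) \\
&= \max\Big(\Big\{\tfrac{p + a}{q + a}\Big\} \cup \Big\{\tfrac{p_i}{q_i} \;\Big|\; 1 \leq i \leq n,\ q_i \neq 0\Big\} \cup \{0\}\Big) \quad \{\text{By the observation above about } f\} \\
&= \top\Big((p + a, q + a) \lor \bigvee_{i=1}^{n} (p_i, q_i)\Big).
\end{aligned}$$

Now assume that $q = a = 0$. Then

$$\begin{aligned}
\top\Big(\big((p + a, q + a) \lor (p, q)\big) \lor \bigvee_{i=1}^{n} (p_i, q_i)\Big)
&= \max\Big(\Big\{\tfrac{p_i}{q_i} \;\Big|\; 1 \leq i \leq n,\ q_i \neq 0\Big\} \cup \{0\}\Big) \\
&= \top\Big((p + a, 0) \lor \bigvee_{i=1}^{n} (p_i, q_i)\Big) \\
&= \top\Big((p + a, q + a) \lor \bigvee_{i=1}^{n} (p_i, q_i)\Big). \qquad \Box
\end{aligned}$$

The fact that $\top(\hat{\delta}) = \top(\delta)$ follows from the previous lemma.

Finally, the following theorem provides recursive equations for the values of $\hat{\delta}_s^{\mathcal{U}}$ and $\hat{\delta}_s^{\Box}$. If the MDP is acyclic, it can be used to compute these values.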



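Theorem 2.6.11. Let $\Pi$ be a MDP, $s \in S$, and $\phi_1 \mathcal{U} \phi_2, \psi_1 \mathcal{U} \psi_2, \Box\psi_1 \in \mathrm{Path}$. Then

$$\hat{\delta}_s(\phi_1 \mathcal{U} \phi_2 \mid \psi_1 \mathcal{U} \psi_2) =
\begin{cases}
\big(P^+_s[\psi_1 \mathcal{U} \psi_2],\, P^+_s[\psi_1 \mathcal{U} \psi_2]\big) & \text{if } s \models \phi_2, \\
\big(P^+_s[\phi_1 \mathcal{U} \phi_2],\, 1\big) & \text{if } s \models \neg\phi_2 \land \psi_2, \\
\big(0,\, P^-_s[\psi_1 \mathcal{U} \psi_2]\big) & \text{if } s \models \neg\phi_1 \land \neg\phi_2 \land \neg\psi_2, \\
(0, 0) & \text{if } s \models \phi_1 \land \neg\phi_2 \land \neg\psi_1 \land \neg\psi_2, \\
\displaystyle\bigvee_{\pi \in \tau(s)} \Big(\sum_{t \in \mathrm{succ}(s)} \pi(t) \odot \hat{\delta}_t(\phi_1 \mathcal{U} \phi_2 \mid \psi_1 \mathcal{U} \psi_2)\Big) & \text{if } s \models \phi_1 \land \neg\phi_2 \land \psi_1 \land \neg\psi_2,
\end{cases}$$

and

$$\hat{\delta}_s(\phi_1 \mathcal{U} \phi_2 \mid \Box\psi_1) =
\begin{cases}
\big(P^+_s[\Box\psi_1],\, P^+_s[\Box\psi_1]\big) & \text{if } s \models \phi_2, \\
(0, 0) & \text{if } s \models \neg\phi_2 \land \neg\psi_1, \\
\big(0,\, P^-_s[\Box\psi_1]\big) & \text{if } s \models \neg\phi_1 \land \neg\phi_2 \land \psi_1, \\
\displaystyle\bigvee_{\pi \in \tau(s)} \Big(\sum_{t \in \mathrm{succ}(s)} \pi(t) \odot \hat{\delta}_t(\phi_1 \mathcal{U} \phi_2 \mid \Box\psi_1)\Big) & \text{if } s \models \phi_1 \land \neg\phi_2 \land \psi_1.
\end{cases}$$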


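Proof. We will consider the case $\mathcal{U}$. We will use $\varphi$ to denote $\neg(\phi_1 \land \neg\phi_2 \land \psi_1 \land \neg\psi_2)$, i.e., the stopping condition of the cpCTL formula under consideration. In every case the starting point is

$$\hat{\delta}_s(\phi_1 \mathcal{U} \phi_2 \mid \psi_1 \mathcal{U} \psi_2) = \bigvee_{\eta \in \mathrm{Sch}^{\varphi}_s(\Pi)} \big(P_{s,\eta}[\phi_1 \mathcal{U} \phi_2 \land \psi_1 \mathcal{U} \psi_2],\, P_{s,\eta}[\psi_1 \mathcal{U} \psi_2]\big).$$

(a) Note that if $s \models \phi_2$, then semi HI schedulers are exactly the HI schedulers. Since $s \models \phi_2$,

$$\hat{\delta}_s(\phi_1 \mathcal{U} \phi_2 \mid \psi_1 \mathcal{U} \psi_2) = \bigvee_{\eta} \big(P_{s,\eta}[\psi_1 \mathcal{U} \psi_2],\, P_{s,\eta}[\psi_1 \mathcal{U} \psi_2]\big) = \big(P^+_s[\psi_1 \mathcal{U} \psi_2],\, P^+_s[\psi_1 \mathcal{U} \psi_2]\big) \quad \{\text{Case (3)}\}$$

(b) Since $s \models \neg\phi_2 \land \psi_2$,

$$\hat{\delta}_s(\phi_1 \mathcal{U} \phi_2 \mid \psi_1 \mathcal{U} \psi_2) = \bigvee_{\eta} \big(P_{s,\eta}[\phi_1 \mathcal{U} \phi_2],\, P_{s,\eta}[\mathrm{true}]\big) = \big(P^+_s[\phi_1 \mathcal{U} \phi_2],\, 1\big) \quad \{\text{Case (2) and definition of } P[\mathrm{true}]\}$$

(c) Since $s \models \neg\phi_1 \land \neg\phi_2 \land \neg\psi_2$,

$$\hat{\delta}_s(\phi_1 \mathcal{U} \phi_2 \mid \psi_1 \mathcal{U} \psi_2) = \bigvee_{\eta} \big(P_{s,\eta}[\mathrm{false}],\, P_{s,\eta}[\psi_1 \mathcal{U} \psi_2]\big) = \big(0,\, P^-_s[\psi_1 \mathcal{U} \psi_2]\big) \quad \{\text{Case (1) and definition of } P[\mathrm{false}]\}$$

(d) Since $s \models \phi_1 \land \neg\phi_2 \land \neg\psi_1 \land \neg\psi_2$,

$$\hat{\delta}_s(\phi_1 \mathcal{U} \phi_2 \mid \psi_1 \mathcal{U} \psi_2) = \bigvee_{\eta} \big(P_{s,\eta}[\mathrm{false}],\, P_{s,\eta}[\mathrm{false}]\big) = (0, 0)$$

(e) Since $s \models \phi_1 \land \neg\phi_2 \land \psi_1 \land \neg\psi_2$,

$$\begin{aligned}
\hat{\delta}_s(\phi_1 \mathcal{U} \phi_2 \mid \psi_1 \mathcal{U} \psi_2)
&= \bigvee_{\eta} \big(P_{s,\eta}[\phi_1 \mathcal{U} \phi_2 \land \psi_1 \mathcal{U} \psi_2],\, P_{s,\eta}[\psi_1 \mathcal{U} \psi_2]\big) \\
&= \bigvee_{\eta} \Big(\sum_{t \in \mathrm{succ}(s)} \eta(s)(t) \odot \big(P_{t,\eta}[\phi_1 \mathcal{U} \phi_2 \land \psi_1 \mathcal{U} \psi_2],\, P_{t,\eta}[\psi_1 \mathcal{U} \psi_2]\big)\Big) \\
&= \bigvee_{\pi \in \tau(s)} \Big(\sum_{t \in \mathrm{succ}(s)} \pi(t) \odot \bigvee_{\eta_t} \big(P_{t,\eta_t}[\phi_1 \mathcal{U} \phi_2 \land \psi_1 \mathcal{U} \psi_2],\, P_{t,\eta_t}[\psi_1 \mathcal{U} \psi_2]\big)\Big) \quad \{\text{Since } \Pi \text{ is acyclic}\} \\
&= \bigvee_{\pi \in \tau(s)} \Big(\sum_{t \in \mathrm{succ}(s)} \pi(t) \odot \hat{\delta}_t(\phi_1 \mathcal{U} \phi_2 \mid \psi_1 \mathcal{U} \psi_2)\Big). \qquad \Box
\end{aligned}$$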



From MDPs to Acyclic MDPs 

Now, we show how to reduce a MDP with cycles to an acyclic one, thus generalizing our results to MDPs with cycles. For that purpose we first reduce all cycles in $\Pi$ and create a new acyclic MDP $[\Pi]$ such that the probabilities involved in the computation of $P^+[-\mid-]$ are preserved. We do so by removing every strongly connected component (SCC) $k$ of (the graph of) $\Pi$, keeping only input states and transitions to output states (in the spirit of [ADvR08]). We show that $P^+[-\mid-]$ on $[\Pi]$ is equal to the corresponding value on $\Pi$. For this, we have to make sure that states satisfying the stopping condition are ignored when removing SCCs.

(1) Identifying SCCs. Our first step is to make states satisfying the stopping condition absorbing.

Definition 2.6.12. Let $\Pi = (S, s_0, \tau, L)$ be a MDP and $\varphi \in \mathrm{Stat}$ a state formula. We define a new MDP $\langle\Pi\rangle_\varphi = (S, s_0, \langle\tau\rangle_\varphi, L)$ where $\langle\tau\rangle_\varphi(s)$ is equal to $\tau(s)$ if $s \not\models \varphi$ and to $1_s$ otherwise.

To recognize cycles in the MDP we define a graph associated to it.

Definition 2.6.13. Let $\Pi = (S, s_0, \tau, L)$ be a MDP and $\varphi \in \mathrm{Stat}$. We define the digraph $G = G_{\Pi,\varphi} = (S, \to)$ associated to $\langle\Pi\rangle_\varphi = (S, s_0, \langle\tau\rangle_\varphi, L)$ where $\to$ satisfies $u \to v \Leftrightarrow \exists \pi \in \langle\tau\rangle_\varphi(u).\ \pi(v) > 0$.

Now we let $\mathrm{SCC} = \mathrm{SCC}_{\Pi,\varphi} \subseteq \wp(S)$ be the set of SCCs of $G$. For each SCC $k$ we define the set $\mathrm{Inp}_k$ of all states in $k$ that have an incoming transition of $\Pi$ from a state outside of $k$; we also define the set $\mathrm{Out}_k$ of all states outside of $k$ that have an incoming transition from a state of $k$. Formally, for each $k \in \mathrm{SCC}$ we define

$$\mathrm{Inp}_k = \{u \in k \mid \exists s \in S \setminus k \text{ such that } (s, u) \in \varrho\},$$
$$\mathrm{Out}_k = \{s \in S \setminus k \mid \exists u \in k \text{ such that } (u, s) \in \varrho\},$$

where $\varrho$ is the successor relation defined in Section 2.2.

We then associate a MDP $\Pi_k$ to each SCC $k$ of $G$. The space of states of $\Pi_k$ is $k \cup \mathrm{Out}_k$ and the transition relation is induced by the transition relation of $\Pi$.

Definition 2.6.14. Let $\Pi$ be a MDP and $k \in \mathrm{SCC}$ be a SCC in $\Pi$. We pick an arbitrary element $s_k$ of $\mathrm{Inp}_k$ and define the MDP $\Pi_k = (S_k, s_k, \tau_k, L)$ where $S_k = k \cup \mathrm{Out}_k$ and $\tau_k(s)$ is equal to $\{1_s\}$ if $s \in \mathrm{Out}_k$ and to $\tau(s)$ otherwise.
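The input and output states of an SCC can be computed with a short sketch. It assumes a hypothetical encoding in which `tau[s]` is a list of distributions (dictionaries from successor states to probabilities), in which states satisfying the stopping condition have already been made absorbing, and in which only SCCs with at least two states are reported. That last restriction is a choice of this sketch, not something the definitions above state.

```python
def successors(tau, u):
    """States v with pi(v) > 0 for some distribution pi in tau[u]."""
    return {v for dist in tau.get(u, []) for v, prob in dist.items() if prob > 0}

def reachable(tau, start):
    """States reachable from `start` in one or more steps."""
    seen, stack = set(), [start]
    while stack:
        u = stack.pop()
        for v in successors(tau, u):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def nontrivial_sccs(tau, states):
    """SCCs with at least two states, via mutual reachability (fine for small graphs)."""
    reach = {s: reachable(tau, s) for s in states}
    sccs, assigned = [], set()
    for s in sorted(states):
        if s in assigned:
            continue
        comp = {t for t in reach[s] if s in reach[t]}
        if len(comp) >= 2:
            sccs.append(frozenset(comp))
            assigned |= comp
    return sccs

def inp_out(tau, states, k):
    """Inp_k: states of k entered from outside; Out_k: outside states entered from k."""
    inp = {u for s in states - k for u in successors(tau, s) if u in k}
    out = {v for u in k for v in successors(tau, u) if v not in k}
    return inp, out

# Hypothetical MDP: s1 and s2 form a cycle; s3 and s4 are absorbing.
tau = {
    "s0": [{"s1": 1.0}],
    "s1": [{"s2": 0.5, "s3": 0.5}],
    "s2": [{"s1": 1.0}, {"s4": 1.0}],
    "s3": [{"s3": 1.0}],
    "s4": [{"s4": 1.0}],
}
states = set(tau)
sccs = nontrivial_sccs(tau, states)
inp, out = inp_out(tau, states, sccs[0])
```

For graphs of realistic size a linear-time SCC algorithm such as Tarjan's would be the natural replacement for the quadratic reachability test used here.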






(2) Constructing an acyclic MDP. To obtain a reduced acyclic MDP 
from the original one we first define the probability of reaching one state 
from another according to a given HI scheduler in the following way. 

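Definition 2.6.15. Let $\Pi = (S, s_0, \tau, L)$ be a MDP, and $\eta$ be a HI scheduler on $\Pi$. Then for each $s, t \in S$ we define the function $R$ such that $R_\Pi(s \overset{\eta}{\rightsquigarrow} t) \triangleq P_{s,\eta}(\{\omega \in \mathrm{Paths}(s) \mid \exists i.\ \omega_i = t\})$.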

We note that such reachability values can be efficiently computed using 
steady-state analysis techniques [Cas93]. 
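As a small illustration of such a reachability value, the sketch below fixes a HI scheduler, which turns the MDP into a Markov chain, and approximates the probability of eventually reaching a target state. It uses plain fixed-point iteration from zero rather than the steady-state techniques cited above; the encoding `chain[s][t]` and the name `reach_probability` are assumptions of this example.

```python
def reach_probability(chain, source, target, sweeps=1000):
    """Approximate the probability of eventually visiting `target` from `source`.

    chain[s] maps each successor t to the probability of moving from s to t.
    Iterating from zero converges to the least solution of
    x[target] = 1 and x[s] = sum_t chain[s][t] * x[t] for s != target.
    """
    states = set(chain) | {t for dist in chain.values() for t in dist}
    x = {s: 0.0 for s in states}
    x[target] = 1.0
    for _ in range(sweeps):
        for s in states:
            if s != target:
                x[s] = sum(p * x[t] for t, p in chain.get(s, {}).items())
    return x[source]

# Hypothetical chain: s1 and s2 form a cycle with exits to the absorbing states s3 and s4.
chain = {
    "s1": {"s2": 0.5, "s3": 0.5},
    "s2": {"s1": 0.5, "s4": 0.5},
    "s3": {"s3": 1.0},
    "s4": {"s4": 1.0},
}
to_s4 = reach_probability(chain, "s1", "s4")
to_s3 = reach_probability(chain, "s1", "s3")
```

In this example the two exit probabilities from `s1` are 1/3 and 2/3, which is exactly the kind of distribution that replaces the cycle in the reduced MDP.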

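Now we are able to define an acyclic MDP $[\Pi]$ related to $\Pi$ such that $P^+_{[\Pi]}[-\mid-] = P^+_{\Pi}[-\mid-]$.

Definition 2.6.16. Let $\Pi = (S, s_0, \tau, L)$ be a MDP. Then we define $[\Pi]$ as $([S], s_0, [\tau], L)$ where

$$[S] = \underbrace{S \setminus \bigcup_{k \in \mathrm{SCC}} k}_{S_{\mathrm{com}}} \;\cup\; \underbrace{\bigcup_{k \in \mathrm{SCC}} \mathrm{Inp}_k}_{S_{\mathrm{inp}}}$$

and for all $s \in [S]$ the set $[\tau](s)$ of probabilistic distributions on $[S]$ is given by

$$[\tau](s) =
\begin{cases}
\tau(s) & \text{if } s \in S_{\mathrm{com}}, \\
\big\{\lambda t \in [S].\ R_{\Pi_{k_s}}(s \overset{\eta}{\rightsquigarrow} t) \;\big|\; \eta \in \mathrm{Sch}^{\mathrm{HI}}(\Pi_{k_s})\big\} & \text{if } s \in S_{\mathrm{inp}}.
\end{cases}$$

Here $k_s$ is the SCC associated to $s$.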

Theorem 2.6.17. Let $\Pi = (S, s_0, \tau, L)$ be a MDP, and $P_{\leq a}[\phi \mid \psi] \in$ cpCTL. Then $[\Pi]$ is an acyclic MDP and $P^+_{s_0,\Pi}[\phi \mid \psi] = P^+_{s_0,[\Pi]}[\phi \mid \psi]$, where $P^+_{s,\Pi'}[-\mid-]$ represents $P^+_s[-\mid-]$ on the MDP $\Pi'$.

Proof. The proof follows straightforwardly by the construction of $[\Pi]$ and Theorem 2.5.11. □

Finally we can use the technique for acyclic MDPs on the reduced MDP in order to obtain $P^+_{s_0}[-\mid-]$.

2.6.2 Complexity

As mentioned before, when computing maximum or minimum conditional probabilities it is not possible to locally optimize. Therefore, it is necessary to carry, for each deterministic and HI scheduler $\eta$, the pair of probabilities $(P_\eta[\phi \land \psi], P_\eta[\psi])$ from the leaves (states satisfying the stopping condition) to the initial state. As the number of HI schedulers in a MDP grows exponentially in the size of the state space, our algorithm to verify cpCTL formulas has exponential time complexity.

We believe that the complexity of computing optimal conditional probabilities is intrinsically exponential, i.e. computing such probabilities is an NP problem. However, a deeper study in this direction is still missing.



Conditional probability bounds Even if computing exact conditional 
probabilities is computationally expensive (exponential time), it is still pos- 
sible to efficiently compute upper and lower bounds for such probabilities 
(polynomial time). 

Observation 2.6.1. Let $\Pi$ be a MDP and $\phi$, $\psi$ two path pCTL formulas.
Then we have 
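A pair of bounds of this kind can be justified directly from the definitions. The following is a sketch under the assumption that $P^-[\psi] > 0$, writing $P^+$ and $P^-$ for the maximum and minimum over schedulers; it is one natural candidate and not necessarily the exact form of the observation.

```latex
% Upper bound: for every scheduler \eta with P_\eta[\psi] > 0,
%   P_\eta[\phi \wedge \psi] \le P^+[\phi \wedge \psi]  and  P_\eta[\psi] \ge P^-[\psi] = 1 - P^+[\neg\psi].
% Lower bound: take a scheduler \eta^* maximizing P_\eta[\phi \wedge \psi];
%   then P_{\eta^*}[\psi] \le P^+[\psi] = 1 - P^-[\neg\psi].
\[
  \frac{P^+[\phi \wedge \psi]}{1 - P^-[\neg\psi]}
  \;\le\;
  P^+[\phi \mid \psi]
  \;\le\;
  \frac{P^+[\phi \wedge \psi]}{1 - P^+[\neg\psi]}
\]
```

Each of the quantities on the left and right is an ordinary maximum or minimum probability, which is why bounds of this shape avoid the enumeration of schedulers.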



2.7 Counterexamples for cpCTL 

Counterexamples in model checking provide important diagnostic information used, among others, for debugging, abstraction-refinement [CGJ+00], and scheduler synthesis [LBB+01]. For systems without probability, a counterexample typically consists of a path violating the property under consideration. Counterexamples in MCs are sets of paths. E.g., a counterexample for the formula $P_{\leq a}[\phi]$ is a set $\Delta$ of paths, each satisfying $\phi$, and such that the probability mass of $\Delta$ is greater than $a$ [HK07a, ADvR08, AL06].

In MDPs, we first have to find the scheduler achieving the optimal probability. Both for pCTL and cpCTL, this scheduler can be derived from the algorithms computing the optimal probabilities [ADvR08]. Once the optimal scheduler is fixed, the MDP can be turned into a Markov Chain and the approaches mentioned before can be used to construct counterexamples for pCTL. For cpCTL however, the situation is slightly more complex. It follows directly from the semantics that:

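Lemma 2.7.1. Let $a \in [0, 1]$ and consider the formula $P_{\leq a}[\phi \mid \psi]$. Let $\Delta_\phi \triangleq \{\omega \in \mathrm{Paths} \mid \omega \models \phi\}$, $\Delta_1 \subseteq \Delta_{\phi \land \psi}$, and $\Delta_2 \subseteq \Delta_{\neg\psi}$. Then $a < P_\eta(\Delta_1)/(1 - P_\eta(\Delta_2))$ implies $a < P_\eta[\phi \mid \psi]$.

Proof. We first note that

$$P_\eta(\Delta_1) \leq P_\eta(\Delta_{\phi \land \psi}) \quad \text{and} \quad P_\eta(\Delta_2) \leq P_\eta(\Delta_{\neg\psi}).$$

Then, it is easy to see that

$$\frac{P_\eta(\Delta_1)}{1 - P_\eta(\Delta_2)} \leq \frac{P_\eta(\Delta_{\phi \land \psi})}{1 - P_\eta(\Delta_{\neg\psi})} = \frac{P_\eta(\Delta_{\phi \land \psi})}{P_\eta(\Delta_\psi)} = P_\eta[\phi \mid \psi]. \qquad \Box$$

This leads to the following notion of counterexample.

Definition 2.7.2. A counterexample for $P_{\leq a}[\phi \mid \psi]$ is a pair $(\Delta_1, \Delta_2)$ of measurable sets of paths satisfying $\Delta_1 \subseteq \Delta_{\phi \land \psi}$, $\Delta_2 \subseteq \Delta_{\neg\psi}$, and $a < P_\eta(\Delta_1)/(1 - P_\eta(\Delta_2))$, for some scheduler $\eta$.

Note that such sets $\Delta_1$ and $\Delta_2$ can be computed using the techniques on Markov Chains mentioned above.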
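The numeric condition of Definition 2.7.2 is easy to check once the probability masses of the two sets are known. The sketch below assumes the masses have already been computed for a fixed scheduler; the function name and the numbers are hypothetical and do not come from the example that follows.

```python
from fractions import Fraction

def is_counterexample(bound, mass_delta1, mass_delta2):
    """Check a < P(Delta_1) / (1 - P(Delta_2)) for given probability masses."""
    if mass_delta2 >= 1:
        return False  # the quotient is undefined when 1 - P(Delta_2) = 0
    return bound < mass_delta1 / (1 - mass_delta2)

# Hypothetical masses: paths worth 1/2 satisfy phi and psi, paths worth 1/4 violate psi.
found = is_counterexample(Fraction(1, 2), Fraction(1, 2), Fraction(1, 4))
# Without any evidence against psi the same Delta_1 is not enough.
not_found = is_counterexample(Fraction(1, 2), Fraction(1, 2), Fraction(0))
```

The second call shows why the pair matters: enlarging $\Delta_2$ shrinks the denominator, so evidence that $\psi$ fails can be as useful as evidence that $\phi \land \psi$ holds.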

Example 2.7.3. Consider the evaluation of $s_0 \models P_{\leq a}[\Diamond B \mid \Box P]$ on the MDP obtained by fixing the value of $\alpha$ in the MDP depicted in Figure 2.1. The corresponding MDP is shown in Figure 2.7(a). In this case the maximizing scheduler, say $\eta$, chooses $\pi_2$ in $s_2$. In Figure 2.7(b) we show the Markov Chain derived from the MDP using $\eta$. In this setting $P_{s_0,\eta}[\Diamond B \mid \Box P]$ exceeds the bound $a$, and consequently $s_0$ does not satisfy this formula.

We show this fact with the notion of counterexample of Definition 2.7.2. Note that $\Delta_{\Diamond B \land \Box P} = \langle s_0 s_1 \rangle \cup \langle s_0 s_2 s_3 \rangle$ and $\Delta_{\neg\Box P} = \langle s_0 s_2 s_5 \rangle$. Using Lemma 2.7.1 with $\Delta_1 = \langle s_0 s_1 \rangle$ and $\Delta_2 = \langle s_0 s_2 s_5 \rangle$ we have $a < P_\eta(\Delta_1)/(1 - P_\eta(\Delta_2)) = P_\eta(\Delta_1)/(1 - 1/8)$. Consequently $a < P_{s_0,\eta}[\Diamond B \mid \Box P]$, which proves that $s_0 \not\models P_{\leq a}[\Diamond B \mid \Box P]$.



Chapter 3 



Computing the Leakage of 
Information Hiding Systems 

In this chapter we address the problem of computing the infor- 
mation leakage of a system in an efficient way. We propose two 
methods: one based on reducing the problem to reachability, and 
the other based on techniques from quantitative counterexample 
generation. The second approach can be used either for exact or 
approximate computation, and provides feedback for debugging. 
These methods can be applied also in the case in which the in- 
put distribution is unknown. We then consider the interactive 
case and we point out that the definition of associated channel 
proposed in literature is not sound. We show however that the 
leakage can still be defined consistently, and that our methods
extend smoothly. 

3.1 Introduction 

By information hiding, we refer generally to the problem of constructing protocols or programs that protect sensitive information from being deduced by some adversary. In anonymity protocols [CPP08a], for example, the concern is to design mechanisms to prevent an observer of network traffic from deducing who is communicating. In secure information flow [SM03], the concern is to prevent programs from leaking their secret input to an observer of their public output. Such leakage could be accidental or malicious.

Recently, there has been particular interest in approaching these issues quantitatively, using concepts of information theory. See for example [MNCM03, CHM05b, DPW06, CMS09, CPP08a]. The secret input $S$ and the observable output $O$ of an information-hiding system are modeled as random variables related by a channel matrix, whose $(s, o)$ entry specifies $P(o \mid s)$, the conditional probability of observing output $o$ given input $s$.
If we define the vulnerability of S as the probability that the adversary 
could correctly guess the value of S in one try, then it is natural to mea- 
sure the information leakage by comparing the a priori vulnerability of S 
with the a posteriori vulnerability of S after observing O. We consider two 
measures of leakage: additive, which is the difference between the a posteri- 
ori and a priori vulnerabilities; and multiplicative, which is their quotient 
[Smi09, BCP09]. 

We thus view a protocol or program as a noisy channel, and we calculate 
the leakage from the channel matrix and the a priori distribution on S. 
But, given an operational specification of a protocol or program, how do 
we calculate the parameters of the noisy channel: the sets of inputs and 
outputs, the a priori distribution, the channel matrix, and the associated 
leakage? These are the main questions we address in this chapter. We focus 
on probabilistic automata, whose transitions are labeled with probabilities 
and actions, each of which is classified as secret, observable, or internal. 

We first consider the simple case in which the secret inputs take place at the beginning of runs, and their probability is fixed. The interpretation of this kind of system in terms of a noisy channel is well understood in the literature. The framework of probabilistic automata, however, allows us to represent more general situations. Thanks to nondeterministic choice, we can model the case in which the input distribution is unknown, or variable. We show that the definition of channel matrix extends smoothly also to this case. Finally, we turn our attention to the interactive scenario in which inputs can occur again after outputs. This case has also been considered in the literature, and there has been an attempt to define the channel matrix in terms of the probabilities of traces [DJGP02]. However, it turns out that the notion of channel is unsound. Fortunately the leakage is still well defined, and it can be obtained in the same way as in the simple case.



We consider two different approaches to computing the channel matrix. 
One uses a system of linear equations as in reachability computations. With 
this system of equations one can compute the joint matrix, the matrix of 
probabilities of observing both s and o; the channel matrix is trivially de- 
rived from this joint matrix. The other approach starts with a channel 
matrix, which we call a partial matrix at this point. We iteratively add the 
contributions in conditional probabilities of complete paths to this partial 
matrix, obtaining, in the limit, the channel matrix itself. We then group 
paths with the same secret and the same observable together using ideas 
from quantitative counterexample generation, namely by using regular ex- 
pressions and strongly connected component analysis. In this way, we can 
add the contribution of (infinitely) many paths at the same time to the 
partial matrices. This second approach also makes it possible to identify 
which parts of a protocol contribute most to the leakage, which is useful 
for debugging. 



Looking ahead, after reviewing some preliminaries (Section 3.2) we 
present restrictions on probabilistic automata to ensure that they have 
well-defined and finite channel matrices (Section 3.3). This is followed by 
the techniques to calculate the channel matrix efficiently (Section 3.4 and 
Section 3.5). We then turn our attention to extensions of our information- 
hiding system model. We use nondeterministic choice to model the situa- 
tion where the a priori distribution on the secret is unknown (Section 3.6). 
Finally, we consider interactive systems, in which secret actions and observable actions can be interleaved arbitrarily (Section 3.7).

3.2 Preliminaries

3.2.1 Probabilistic automata 

This section recalls some basic notions on probabilistic automata. More details can be found in [Seg95]. A function $\mu : Q \to [0, 1]$ is a discrete probability distribution on a set $Q$ if the support of $\mu$ is countable and $\sum_{q \in Q} \mu(q) = 1$. The set of all discrete probability distributions on $Q$ is denoted by $\mathcal{D}(Q)$.

A probabilistic automaton is a quadruple $M = (Q, \Sigma, \hat{q}, \alpha)$ where $Q$ is a countable set of states, $\Sigma$ a finite set of actions, $\hat{q}$ the initial state, and $\alpha$ a transition function $\alpha : Q \to \wp_f(\mathcal{D}(\Sigma \times Q))$. Here $\wp_f(X)$ is the set of all finite subsets of $X$. If $\alpha(q) = \emptyset$ then $q$ is a terminal state. We write $q \to \mu$ for $\mu \in \alpha(q)$, $q \in Q$. Moreover, we write $q \overset{a}{\to} r$ for $q, r \in Q$ whenever $q \to \mu$ and $\mu(a, r) > 0$. A fully probabilistic automaton is a probabilistic automaton satisfying $|\alpha(q)| \leq 1$ for all states. In case $\alpha(q) \neq \emptyset$ we will overload notation and use $\alpha(q)$ to denote the distribution outgoing from $q$.

A path in a probabilistic automaton is a sequence $\sigma = q_0 \overset{a_1}{\to} q_1 \overset{a_2}{\to} \cdots$ where $q_i \in Q$, $a_i \in \Sigma$, and $q_i \overset{a_{i+1}}{\to} q_{i+1}$. A path can be finite, in which case it ends with a state. A path is complete if it is either infinite or finite ending in a terminal state. Given a path $\sigma$, $\mathrm{first}(\sigma)$ denotes its first state, and if $\sigma$ is finite then $\mathrm{last}(\sigma)$ denotes its last state. A cycle is a path $\sigma$ such that $\mathrm{last}(\sigma) = \mathrm{first}(\sigma)$. We denote the set of actions occurring in a cycle as $\mathrm{CyclesA}(M)$. Let $\mathrm{Paths}_q(M)$ denote the set of all paths, $\mathrm{Paths}^*_q(M)$ the set of all finite paths, and $\mathrm{CPaths}_q(M)$ the set of all complete paths of an automaton $M$, starting from the state $q$. We will omit $q$ if $q = \hat{q}$. Paths are ordered by the prefix relation, which we denote by $\leq$. The trace of a path is the sequence of actions in $\Sigma^* \cup \Sigma^\omega$ obtained by removing the states, hence for the above $\sigma$ we have $\mathrm{trace}(\sigma) = a_1 a_2 \ldots$. If $\Sigma' \subseteq \Sigma$, then $\mathrm{trace}_{\Sigma'}(\sigma)$ is the projection of $\mathrm{trace}(\sigma)$ on the elements of $\Sigma'$. The length of a finite path $\sigma$, denoted by $|\sigma|$, is the number of actions in its trace.

Let $M = (Q, \Sigma, \hat{q}, \alpha)$ be a (fully) probabilistic automaton, $q \in Q$ a state, and let $\sigma \in \mathrm{Paths}^*_q(M)$ be a finite path starting in $q$. The cone generated by $\sigma$ is the set of complete paths $\langle\sigma\rangle = \{\sigma' \in \mathrm{CPaths}_q(M) \mid \sigma \leq \sigma'\}$. Given a fully probabilistic automaton $M = (Q, \Sigma, \hat{q}, \alpha)$ and a state $q$, we can calculate the probability value, denoted by $P_q(\sigma)$, of any finite path $\sigma$ starting in $q$ as follows: $P_q(q) = 1$ and $P_q(\sigma \overset{a}{\to} q') = P_q(\sigma)\, \mu(a, q')$, where $\mathrm{last}(\sigma) \to \mu$.

Let $\Omega_q \triangleq \mathrm{CPaths}_q(M)$ be the sample space, and let $\mathcal{F}_q$ be the smallest $\sigma$-algebra generated by the cones. Then $P_q$ induces a unique probability measure on $\mathcal{F}_q$ (which we will also denote by $P_q$) such that $P_q(\langle\sigma\rangle) = P_q(\sigma)$ for every finite path $\sigma$ starting in $q$. For $q = \hat{q}$ we write $P$ instead of $P_{\hat{q}}$.

Given a probability space $(\Omega, \mathcal{F}, P)$ and two events $A, B \in \mathcal{F}$ with $P(B) > 0$, the conditional probability of $A$ given $B$, $P(A \mid B)$, is defined as $P(A \cap B)/P(B)$.
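The inductive definition of the probability of a finite path can be illustrated with a minimal sketch. It assumes a hypothetical encoding of a fully probabilistic automaton in which `alpha[q]` maps each pair `(action, next_state)` to its probability, and a path is given as an alternating list of states and actions.

```python
from fractions import Fraction

def path_probability(alpha, path):
    """Probability of a finite path [q0, a1, q1, a2, q2, ...].

    Follows P(q) = 1 and P(sigma -a-> q') = P(sigma) * mu(a, q'),
    where mu is the distribution leaving the last state of sigma.
    """
    prob = Fraction(1)
    for i in range(0, len(path) - 2, 2):
        q, action, q_next = path[i], path[i + 1], path[i + 2]
        prob *= alpha.get(q, {}).get((action, q_next), Fraction(0))
    return prob

# Hypothetical automaton: a secret choice in q0, then observable or internal steps.
alpha = {
    "q0": {("a", "qa"): Fraction(1, 3), ("b", "qb"): Fraction(2, 3)},
    "qa": {("A", "qf"): Fraction(1, 2), ("tau", "qb"): Fraction(1, 2)},
    "qb": {("B", "qf"): Fraction(1, 2), ("tau", "qa"): Fraction(1, 2)},
}
p = path_probability(alpha, ["q0", "a", "qa", "tau", "qb", "B", "qf"])
```

A path that uses a transition absent from the automaton gets probability 0, which matches the convention that only transitions with positive probability form paths.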

3.2.2 Noisy Channels 

This section briefly recalls the notion of noisy channels from Information 
Theory [CT06]. 

A noisy channel is a tuple $\mathcal{C} \triangleq (\mathcal{X}, \mathcal{Y}, P(\cdot \mid \cdot))$ where $\mathcal{X} = \{x_1, x_2, \ldots, x_n\}$ is a finite set of input values, modeling the secrets of the channel, and $\mathcal{Y} = \{y_1, y_2, \ldots, y_m\}$ is a finite set of output values, the observables of the channel. For $x_i \in \mathcal{X}$ and $y_j \in \mathcal{Y}$, $P(y_j \mid x_i)$ is the conditional probability of obtaining the output $y_j$ given that the input is $x_i$. These conditional probabilities constitute the so-called channel matrix, where $P(y_j \mid x_i)$ is the element at the intersection of the $i$-th row and the $j$-th column. For any input distribution $P_X$ on $\mathcal{X}$, $P_X$ and the channel matrix determine a joint probability $P_\wedge$ on $\mathcal{X} \times \mathcal{Y}$, and the corresponding marginal probability $P_Y$ on $\mathcal{Y}$ (and hence a random variable $Y$). $P_X$ is also called the a priori distribution and it is often denoted by $\pi$. The probability of the input given the output is called the a posteriori distribution.

3.2.3 Information leakage 

We recall now some notions of information leakage which allow us to quantify the probability of success of a one-try attacker, i.e. an attacker that tries to obtain the value of the secret in just one guess. In particular, we consider Smith's definition of multiplicative leakage [Smi09]¹, and the additive leakage definition from Braun et al. [BCP09]. We assume given a noisy channel $\mathcal{C} = (\mathcal{X}, \mathcal{Y}, P(\cdot \mid \cdot))$ and a random variable $X$ on $\mathcal{X}$. The a priori vulnerability of the secrets in $\mathcal{X}$ is the probability of guessing the right secret, defined as $V(X) = \max_{x \in \mathcal{X}} P_X(x)$. The rationale behind this definition is that the adversary's best bet is on the secret with highest probability.

The a posteriori vulnerability of the secrets in $\mathcal{X}$ is the probability of guessing the right secret, after the output has been observed, averaged over the probabilities of the observables. The formal definition is $V(X \mid Y) = \sum_{y \in \mathcal{Y}} P_Y(y) \max_{x \in \mathcal{X}} P(x \mid y)$. Again, this definition is based on the principle that the adversary will choose the secret with the highest a posteriori probability.

Note that, using Bayes theorem, we can write the a posteriori vulnerability in terms of the channel matrix and the a priori distribution, or in terms of the joint probability:

$$V(X \mid Y) = \sum_{y \in \mathcal{Y}} \max_{x \in \mathcal{X}} \big(P(y \mid x)\, P_X(x)\big) = \sum_{y \in \mathcal{Y}} \max_{x \in \mathcal{X}} P_\wedge(x, y). \tag{3.1}$$

The multiplicative leakage is then defined as the quotient between the a posteriori and a priori vulnerabilities, $\mathcal{L}_\times(\mathcal{C}, P_X) \triangleq V(X \mid Y) / V(X)$. Similarly, the additive leakage is defined as the difference between both vulnerabilities, $\mathcal{L}_+(\mathcal{C}, P_X) \triangleq V(X \mid Y) - V(X)$.
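These definitions translate directly into a few lines of code. The sketch below assumes a hypothetical dictionary encoding, `prior[x]` for the a priori distribution and `channel[x][y]` for the channel matrix entry of output `y` given input `x`, and uses the first form of (3.1).

```python
from fractions import Fraction

def vulnerabilities(prior, channel):
    """Return (V(X), V(X|Y)) for prior[x] = P_X(x) and channel[x][y] = P(y | x)."""
    outputs = {y for row in channel.values() for y in row}
    v_prior = max(prior.values())
    # Sum over outputs of the largest joint probability P(y | x) * P_X(x).
    v_post = sum(max(prior[x] * channel[x].get(y, 0) for x in prior) for y in outputs)
    return v_prior, v_post

def leakages(prior, channel):
    """Return (multiplicative leakage, additive leakage)."""
    v_prior, v_post = vulnerabilities(prior, channel)
    return v_post / v_prior, v_post - v_prior

# Hypothetical two-secret channel with a uniform prior.
prior = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
channel = {
    "a": {"A": Fraction(2, 3), "B": Fraction(1, 3)},
    "b": {"A": Fraction(1, 3), "B": Fraction(2, 3)},
}
multiplicative, additive = leakages(prior, channel)
```

With these numbers the a priori vulnerability is 1/2 and the a posteriori vulnerability is 2/3, so the adversary's chance of a correct single guess improves by a factor of 4/3.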

3.3 Information Hiding Systems 

To formally analyze the information-hiding properties of protocols and programs, we propose to model them as a particular kind of probabilistic automata, which we call Information-Hiding Systems (IHS). Intuitively, an IHS is a probabilistic automaton in which the actions are divided in three (disjoint) categories: those which are supposed to remain secret (to an external observer), those which are visible, and those which are internal to the protocol.

¹ The notion proposed by Smith in [Smi09] was given in an (equivalent) logarithmic form, and called simply leakage. For uniformity's sake we use here the terminology and formulation of [BCP09].

First we consider only the case in which the choice of the secret takes 
place entirely at the beginning, and is based on a known distribution. Fur- 
thermore we focus on fully probabilistic automata. Later in the chapter we 
will relax these constraints. 

Definition 3.3.1 (Information-Hiding System). An information-hiding system (IHS) is a quadruple $\mathcal{I} = (M, \Sigma_S, \Sigma_O, \Sigma_\tau)$ where $M = (Q, \Sigma, \hat{q}, \alpha)$ is a fully probabilistic automaton, $\Sigma = \Sigma_S \cup \Sigma_O \cup \Sigma_\tau$ where $\Sigma_S$, $\Sigma_O$, and $\Sigma_\tau$ are pairwise disjoint sets of secret, observable, and internal actions, and $\alpha$ satisfies the following restrictions:

1. $\alpha(\hat{q}) \in \mathcal{D}(\Sigma_S \times Q)$,

2. $\forall s \in \Sigma_S\ \exists! q\ .\ \alpha(\hat{q})(s, q) \neq 0$,

3. $\alpha(q) \in \mathcal{D}((\Sigma_O \cup \Sigma_\tau) \times Q)$ for $q \neq \hat{q}$,

4. $\mathrm{CyclesA}(M) \subseteq \Sigma_\tau$,

5. $P(\mathrm{CPaths}(M) \cap \mathrm{Paths}^*(M)) = 1$.

The first two restrictions are on the initial state and mean that only secret actions can happen there (1) and each of those actions must have non-null probability and occur only once (2). Restriction 3 forbids secret actions to happen in the rest of the automaton, and Restriction 4 specifies that only internal actions can occur inside cycles; this restriction is necessary in order to make sure that the channel associated to the IHS has finitely many inputs and outputs. Finally, Restriction 5 means that infinite computations have probability 0 and therefore we can ignore them.

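We now show how to interpret an IHS as a noisy channel. We call $\mathrm{trace}_{\Sigma_S}(\sigma)$ and $\mathrm{trace}_{\Sigma_O}(\sigma)$ the secret and observable traces of $\sigma$, respectively. For $s \in \Sigma_S^*$, we define $[s] \triangleq \{\sigma \in \mathrm{CPaths}(M) \mid \mathrm{trace}_{\Sigma_S}(\sigma) = s\}$; similarly for $o \in \Sigma_O^*$, we define $[o] \triangleq \{\sigma \in \mathrm{CPaths}(M) \mid \mathrm{trace}_{\Sigma_O}(\sigma) = o\}$.

Definition 3.3.2. Given an IHS $\mathcal{I} = (M, \Sigma_S, \Sigma_O, \Sigma_\tau)$, its noisy channel is $(\mathcal{S}, \mathcal{O}, P)$, where $\mathcal{S} \triangleq \Sigma_S$, $\mathcal{O} \triangleq \mathrm{trace}_{\Sigma_O}(\mathrm{CPaths}(M))$, and $P(o \mid s) \triangleq P([o] \mid [s])$. The a priori distribution $\pi \in \mathcal{D}(\mathcal{S})$ of $\mathcal{I}$ is defined by $\pi(s) \triangleq \alpha(\hat{q})(s, \cdot)$. If $\mathcal{C}$ is the noisy channel of $\mathcal{I}$, the multiplicative and additive leakage of $\mathcal{I}$ are naturally defined as

$$\mathcal{L}_\times(\mathcal{I}) \triangleq \mathcal{L}_\times(\mathcal{C}, \pi) \quad \text{and} \quad \mathcal{L}_+(\mathcal{I}) \triangleq \mathcal{L}_+(\mathcal{C}, \pi).$$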

Example 3.3.3. Crowds [RR98] is a well-known anonymity protocol, in 
which a user (called the initiator) wants to send a message to a 
web server without revealing his identity. To 
achieve this, he routes the message through a 
crowd of users participating in the protocol. 
Routing is as follows. In the beginning, the 
initiator randomly selects a user (called a for- 
warder), possibly himself, and forwards the re- 
quest to him. A forwarder performs a probabilis- 
tic choice. With probability p (a parameter of 
the protocol) he selects a new user and again for- 
wards the message. With probability 1 — p he 
sends the message directly to the server. One 
or more users can be corrupted and collaborate 
with each other to try to find the identity of the 
initiator. 

We now show how to model Crowds as an IHS for 2 honest and 1 
corrupted user. We assume that the corrupted user immediately forwards 
messages to the server, as there is no further information to be gained for 
him by bouncing the message back. 

Figure 3.1 shows the automaton². Actions $a$ and $b$ are secret and represent who initiates the protocol; actions $A$, $B$, and $U$ are observable; $A$ and $B$ represent who forwards the message to the corrupted user; $U$ represents the fact that the message arrives at the server undetected by the corrupted user. We assume $U$ to be observable to represent the possibility that the message is made publicly available at the server's site.

Figure 3.1: Crowds Protocol

² For the sake of simplicity, we allow the initiator of the protocol to send the message to the server also in the first step of the protocol.

The channel associated to this IHS has S = {a, b}, O = {A, B, U}, and 
a priori distribution 7r(a) = |,7r(6) = |. Its channel matrix is computed in 
the next section. 

3.4 Reachability analysis approach 

This section presents a method to compute the matrix of joint probabilities
$P_\wedge$ associated to an IHS, defined as

$$P_\wedge(s, o) = P([s] \cap [o]) \quad \text{for all } s \in \mathcal{S} \text{ and } o \in \mathcal{O}.$$

We omit the subscript $\wedge$ when no confusion arises. From $P_\wedge$ we can
derive the channel matrix by dividing $P_\wedge(s, o)$ by $\pi(s)$. The leakage
can be computed directly from $P_\wedge$, using the second form of the a
posteriori vulnerability in (4.1).

We write $x_q^\lambda$ for the probability of the set of paths with trace
$\lambda \in (\Sigma_S \cup \Sigma_O)^*$ starting from the state $q$ of $M$:

$$x_q^\lambda = P_q([\lambda]_q),$$

where $[\lambda]_q = \{\sigma \in \mathrm{CPaths}_q(M) \mid \mathrm{trace}_{\Sigma_S \cup \Sigma_O}(\sigma) = \lambda\}$.
The following key lemma shows the linear relation between the $x_q^\lambda$'s.
We assume, w.l.o.g., that the IHS has a unique final state $q_f$.

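Lemma 3.4.1. Let $\mathcal{I} = (M, \Sigma_S, \Sigma_O, \Sigma_\tau)$ be an IHS.
For all $\lambda \in (\Sigma_S \cup \Sigma_O)^*$ and $q \in Q$ we have

$$\begin{aligned}
x_{q_f}^{\epsilon} &= 1, \\
x_{q_f}^{\lambda} &= 0 \quad \text{for } \lambda \neq \epsilon, \\
x_q^{\epsilon} &= \sum_{h \in \Sigma_\tau} \sum_{q' \in \mathrm{succ}(q)} \alpha(q)(h, q') \cdot x_{q'}^{\epsilon} \quad \text{for } q \neq q_f, \\
x_q^{\lambda} &= \sum_{q' \in \mathrm{succ}(q)} \alpha(q)(\mathrm{first}(\lambda), q') \cdot x_{q'}^{\mathrm{tail}(\lambda)}
 + \sum_{h \in \Sigma_\tau} \alpha(q)(h, q') \cdot x_{q'}^{\lambda} \quad \text{for } \lambda \neq \epsilon \text{ and } q \neq q_f.
\end{aligned}$$

Furthermore, for $s \in \mathcal{S}$ and $o \in \mathcal{O}$ we have
$P([s] \cap [o]) = x_{\hat q}^{so}$.

Using this lemma, one can compute joint probabilities by solving the
system of linear equations in the variables $x_q^\lambda$. It is possible
that the system has multiple solutions; in that case the required solution
is the minimal one.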

Example 3.4.2. Continuing with the Crowds example, we show how to
compute joint probabilities. Note that $q_f = S$. The linear equations from
Lemma 3.4.1 are

Let us fix p = 0.9. By solving the system of linear equations we obtain

We can now compute the channel matrix by dividing each $x_{\hat q}^{so}$ by $\pi(s)$.
The result is shown in the figure above.
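The computation in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transition probabilities (forwarding to each of the three users with probability p/3, delivery to the server with probability 1 - p) and the prior (1/3 for a, 2/3 for b) are read off the Crowds model with two honest users and one corrupted user; the names `solve`, `observe_from_honest`, `channel` and `joint` are hypothetical helpers, not part of any tool used in the thesis.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve the square system A x = b exactly by Gauss-Jordan elimination."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [v - factor * w for v, w in zip(M[r], M[col])]
    return [row[n] for row in M]

p = F(9, 10)
fwd = p / 3      # assumed: forward to each of the 3 users with probability p/3
stop = 1 - p     # assumed: deliver to the server with probability 1 - p
prior = {"a": F(1, 3), "b": F(2, 3)}

def observe_from_honest(o):
    """Probability of observing o when starting in q_a and in q_b.

    Unknowns x_a, x_b satisfy (Lemma 3.4.1, internal moves between honest users):
        x_a = fwd*x_a + fwd*x_b + c_a
        x_b = fwd*x_a + fwd*x_b + c_b
    where c_a, c_b is the one-step probability of emitting o directly.
    """
    rhs = {"A": [fwd, F(0)], "B": [F(0), fwd], "U": [stop, stop]}[o]
    A = [[1 - fwd, -fwd], [-fwd, 1 - fwd]]
    x_a, x_b = solve(A, rhs)
    return {"a": x_a, "b": x_b}

# Channel matrix P(o|s) and joint probabilities P([s] ∩ [o]) = pi(s) * P(o|s).
channel = {(s, o): observe_from_honest(o)[s] for s in "ab" for o in "ABU"}
joint = {(s, o): prior[s] * channel[(s, o)] for (s, o) in channel}
```

Exact rational arithmetic is used so that the entries can be compared directly with fractions; a floating-point solver would serve equally well for larger models.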

3.4.1 Complexity Analysis 

We now analyze the computational complexity for the computation of the
channel matrix of a simple IHS. Note that the only variables (from the
system of equations in Lemma 3.4.1) that are relevant for the computation
of the channel matrix are those $x_q^\lambda$ for which it is possible to get
the trace $\lambda$ starting from state $q$. As a rough overestimate, for each
state $q$, there are at most $|\mathcal{S}| \cdot |\mathcal{O}|$ possible $\lambda$'s:
in the initial state one can have every secret and every observable, in the
other states no secret is possible and only a suffix of an observable can
occur. This gives at most $|Q| \cdot |\mathcal{S}| \cdot |\mathcal{O}|$
variables. Therefore, we can straightforwardly obtain the desired set of
values in $O((|Q| \cdot |\mathcal{S}| \cdot |\mathcal{O}|)^3)$ time (using
Gaussian Elimination). Note that using Strassen's methods the exponent
reduces to 2.807; this consideration applies to similar results in the rest
of the chapter as well.

Because secret actions can happen only at the beginning, the system of
equations has a special form. The variables of the form $x_{\hat q}^{so}$ only
depend on variables of the form $x_q^o$ (with varying $o$ and $q \neq \hat q$)
and not on each other. Hence, we can first solve for all variables of the
form $x_q^o$ and then compute the remaining few of the form $x_{\hat q}^{so}$.
Required time for the first step is $O((|\mathcal{O}| \cdot |Q|)^3)$ and the
time for the second step can be ignored.

Finally, in some cases not only do the secret actions happen only at
the beginning of the protocol, but the observable actions happen only at
the end of the protocol, i.e., after taking a transition with an observable
action, the protocol only performs internal actions (this is, for instance,
the case for our model of Crowds). In this case, one might as well enter a
unique terminal state $q_f$ after an observable action happens. Then the only
relevant variables are of the form $x_{\hat q}^{so}$, $x_q^o$, and
$x_{q_f}^{\epsilon}$; the $x_{\hat q}^{so}$ only depend on the $x_q^o$, the
$x_q^o$ only depend on $x_{q'}^o$ (with the same $o$, but varying $q$'s) and on
$x_{q_f}^{\epsilon} = 1$. Again ignoring the variables $x_{\hat q}^{so}$ for
complexity purposes, the system of equations has a block form with
$|\mathcal{O}|$ blocks of (at most) $|Q|$ variables each. Hence the complexity
in this case decreases to $O(|\mathcal{O}| \cdot |Q|^3)$.

3.5 The Iterative Approach 

We now propose a different approach to compute channel matrices and 
leakage. The idea is to iteratively construct the channel matrix of a system 
by adding probabilities of sets of paths containing paths with the same 
observable trace o and secret trace s to the (o|s) entry of the matrix. 

One reason for this approach is that it allows us to borrow techniques 
from quantitative counterexample generation. This includes the possibility 
of using or extending counterexample generation tools to compute channel 
matrices or leakage. Another reason for this approach is the relationship
with debugging. If a (specification of a) system has a high leakage, the iter- 
ative approach allows us to determine which parts of the system contribute 
most to the high leakage, possibly pointing out flaws of the protocol. Fi- 
nally, if the system under consideration is very large, the iterative approach 
allows us to only approximate the leakage (by not considering all paths, but 
only the most relevant ones) under strict guarantees about the accuracy of 
the approximation. We will focus on the multiplicative leakage; similar 
results can be obtained for the additive case. 



3.5.1 Partial matrices 

We start by defining a sequence of matrices converging to the channel
matrix by adding the probability of complete paths one by one. We also define
partial versions of the a posteriori vulnerability and the leakage. Later, we
show how to use techniques from quantitative counterexample generation
to add probabilities of many (maybe infinitely many) complete paths all at
once.

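Definition 3.5.1. Let $\mathcal{I} = (M, \Sigma_S, \Sigma_O, \Sigma_\tau)$ be an
IHS, $\pi$ its a priori distribution, and $\sigma_1, \sigma_2, \ldots$ an
enumeration of the set of complete paths of $M$. We define the partial matrices
$P^k : \mathcal{S} \times \mathcal{O} \to [0, 1]$ as follows

$$P^0(o|s) = 0, \qquad
P^{k+1}(o|s) = \begin{cases}
P^k(o|s) + \dfrac{P(\langle\sigma_{k+1}\rangle)}{\pi(s)} & \text{if } \mathrm{trace}_{\Sigma_O}(\sigma_{k+1}) = o \text{ and } \mathrm{trace}_{\Sigma_S}(\sigma_{k+1}) = s, \\
P^k(o|s) & \text{otherwise.}
\end{cases}$$

We define the partial vulnerability $V^k_{S,O}$ as
$\sum_o \max_s P^k(o|s) \cdot \pi(s)$, and the partial multiplicative leakage
$\mathcal{L}^k_\times(\mathcal{I})$ as $V^k_{S,O} / \max_s \pi(s)$.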

The following lemma states that partial matrices, a posteriori vulnera- 
bility, and leakage converge to the correct values. 



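Lemma 3.5.2. Let $\mathcal{I} = (M, \Sigma_S, \Sigma_O, \Sigma_\tau)$ be an IHS. Then

1. $P^k(o|s) \le P^{k+1}(o|s)$, and $\lim_{k\to\infty} P^k(o|s) = P(o|s)$,

2. $V^k_{S,O} \le V^{k+1}_{S,O}$, and $\lim_{k\to\infty} V^k_{S,O} = V(S|O)$,

3. $\mathcal{L}^k_\times(\mathcal{I}) \le \mathcal{L}^{k+1}_\times(\mathcal{I})$, and $\lim_{k\to\infty} \mathcal{L}^k_\times(\mathcal{I}) = \mathcal{L}_\times(\mathcal{I})$.

Since rows must sum up to 1, this technique allows us to compute matrices
up to a given error $\epsilon$. We now show how to estimate the error in the
approximation of the multiplicative leakage.

Proposition 3.5.1. Let $\mathcal{I} = (M, \Sigma_S, \Sigma_O, \Sigma_\tau)$ be an IHS. Then we have

$$\mathcal{L}^k_\times(\mathcal{I}) \le \mathcal{L}_\times(\mathcal{I}) \le \mathcal{L}^k_\times(\mathcal{I}) + \sum_{i=1}^{|\mathcal{S}|} (1 - p_i^k),$$

where $p_i^k$ denotes the mass probability of the $i$-th row of $P^k$, i.e.
$p_i^k = \sum_o P^k(o|s_i)$.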

3.5.2 On the computation of partial matrices. 

After showing how partial matrices can be used to approximate channel
matrices and leakage we now turn our attention to accelerating the
convergence. Adding most likely paths first is an obvious way to increase the
convergence rate. However, since automata with cycles have infinitely many
paths, this (still) leaves an infinite number of paths to process. Processing
many paths at once (all having the same observable and secret trace) tackles
both issues at the same time: it increases the rate of convergence and
can deal with infinitely many paths at the same time.

Interestingly enough, these issues also appear in quantitative counterex- 
ample generation. In that area, several techniques have already been pro- 
vided to meet the challenges; we show how to apply those techniques in 
the current context. We consider two techniques: one is to group paths to- 
gether using regular expressions, the other is to group paths together using 
strongly connected component analysis. 
Regular expressions. In [Daw05], regular expressions containing
probability values are used to reason about traces in Markov Chains. This idea
is used in [DHK08] in the context of counterexample generation to group
together paths with the same observable behaviour. The regular expressions
there are over pairs (p, q) with p a probability value and q a state, to be able
to track both probabilities and observables. We now use the same idea to
group together paths with the same secret action and the same observable
actions.

We consider regular expressions over triples of the form $(a, p, q)$ with
$p \in [0, 1]$ a probability value, $a \in \Sigma$ an action label and
$q \in Q$ a state. Regular expressions represent sets of paths as in [DHK08].
We also take the probability value of such a regular expression from that
article.

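Definition 3.5.3. The function $val : \mathcal{R}(\Sigma) \to \mathbb{R}$
evaluates regular expressions:

$$\begin{aligned}
val(\epsilon) &= 1, & val(r \cdot r') &= val(r) \times val(r'), \\
val((a, p, q)) &= p, & val(r^*) &= 1 \quad \text{if } val(r) = 1, \\
val(r + r') &= val(r) + val(r'), & val(r^*) &= \frac{1}{1 - val(r)} \quad \text{if } val(r) \neq 1.
\end{aligned}$$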
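As a small illustration of how such an evaluation function might be implemented, the following Python sketch evaluates regular expressions encoded as nested tuples. The tuple encoding and the names `sym`, `cat`, `star` are assumptions made for this example only; the star case is taken to be the geometric sum 1/(1 - val(r)), as in [Daw05]. The two expressions evaluated are loop-free-prefix terms of the Crowds model with p = 0.9 (initiator a, self-loops at q_a, then either interception by the corrupted user or delivery to the server).

```python
from fractions import Fraction as F

# Hypothetical encoding of regular expressions over triples (action, prob, state):
#   ("eps",)                 the empty word
#   ("sym", a, p, q)         a single triple (a, p, q)
#   ("cat", r1, ..., rn)     concatenation r1 . r2 . ... . rn
#   ("alt", r1, ..., rn)     choice r1 + r2 + ... + rn
#   ("star", r)              Kleene star r*

def sym(a, p, q):
    return ("sym", a, F(p), q)

def cat(*parts):
    return ("cat",) + parts

def star(r):
    return ("star", r)

def val(r):
    """Probability value of a regular expression, following Definition 3.5.3."""
    kind = r[0]
    if kind == "eps":
        return F(1)
    if kind == "sym":
        return r[2]
    if kind == "cat":
        out = F(1)
        for part in r[1:]:
            out *= val(part)
        return out
    if kind == "alt":
        return sum((val(part) for part in r[1:]), F(0))
    if kind == "star":
        v = val(r[1])
        return F(1) if v == 1 else 1 / (1 - v)
    raise ValueError(f"unknown node kind: {kind!r}")

# (a, 1/3, q_a) . (tau, 0.3, q_a)* . (A, 0.3, corr) . (tau, 1, S)
r_intercept = cat(sym("a", F(1, 3), "qa"), star(sym("tau", F(3, 10), "qa")),
                  sym("A", F(3, 10), "corr"), sym("tau", 1, "S"))
# (a, 1/3, q_a) . (tau, 0.3, q_a)* . (U, 0.1, S)
r_deliver = cat(sym("a", F(1, 3), "qa"), star(sym("tau", F(3, 10), "qa")),
                sym("U", F(1, 10), "S"))
```

Here `val(r_intercept)` multiplies 1/3, the star value 1/(1 - 3/10) = 10/7, 3/10 and 1, which gives 1/7; `val(r_deliver)` gives 1/21 in the same way.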

The idea is to obtain regular expressions representing sets of paths of
M; each regular expression will contribute to the approximation of the
channel matrix and leakage. Several algorithms to translate automata into
regular expressions have been proposed (see [Neu05]). Finally, each term
of the regular expression obtained can be processed separately by adding
the corresponding probabilities [Daw05] to the partial matrix.

As mentioned before, all paths represented by the regular expression
should have the same observable and secret trace in order to be able to add
its probability to a single element of the matrix. To ensure that condition
we request the regular expression to be normal, i.e., of the form
$r_1 + \cdots + r_n$ with the $r_i$ containing no $+$'s.

We will now describe this approach by an example. 



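Example 3.5.4. We used JFLAP 7.0 [JFL] to obtain the regular expression
$r = r_1 + r_2 + \cdots + r_{10}$ equivalent to the automaton in Figure 3.1.

$$\begin{aligned}
r_1 &= (b, \tfrac{2}{3}, q_b) \cdot f^* \cdot (B, 0.3, corr) \cdot (\tau, 1, S), \\
r_2 &= (b, \tfrac{2}{3}, q_b) \cdot f^* \cdot (\tau, 0.3, q_a) \cdot (\tau, 0.3, q_a)^* \cdot (A, 0.3, corr) \cdot (\tau, 1, S), \\
r_3 &= (a, \tfrac{1}{3}, q_a) \cdot (\tau, 0.3, q_a)^* \cdot (A, 0.3, corr) \cdot (\tau, 1, S), \\
r_4 &= (b, \tfrac{2}{3}, q_b) \cdot f^* \cdot (U, 0.1, S), \\
r_5 &= (a, \tfrac{1}{3}, q_a) \cdot (\tau, 0.3, q_a)^* \cdot (\tau, 0.3, q_b) \cdot f^* \cdot (B, 0.3, corr) \cdot (\tau, 1, S), \\
r_6 &= (b, \tfrac{2}{3}, q_b) \cdot f^* \cdot (\tau, 0.3, q_a) \cdot (\tau, 0.3, q_a)^* \cdot (U, 0.1, S), \\
r_7 &= (a, \tfrac{1}{3}, q_a) \cdot (\tau, 0.3, q_a)^* \cdot (U, 0.1, S), \\
r_8 &= (a, \tfrac{1}{3}, q_a) \cdot (\tau, 0.3, q_a)^* \cdot (\tau, 0.3, q_b) \cdot f^* \cdot (\tau, 0.3, q_a) \cdot (\tau, 0.3, q_a)^* \cdot (A, 0.3, corr) \cdot (\tau, 1, S), \\
r_9 &= (a, \tfrac{1}{3}, q_a) \cdot (\tau, 0.3, q_a)^* \cdot (\tau, 0.3, q_b) \cdot f^* \cdot (U, 0.1, S), \\
r_{10} &= (a, \tfrac{1}{3}, q_a) \cdot (\tau, 0.3, q_a)^* \cdot (\tau, 0.3, q_b) \cdot f^* \cdot (\tau, 0.3, q_a) \cdot (\tau, 0.3, q_a)^* \cdot (U, 0.1, S),
\end{aligned}$$

where $f = ((\tau, 0.3, q_b)^* \cdot ((\tau, 0.3, q_a) \cdot (\tau, 0.3, q_a)^* \cdot (\tau, 0.3, q_b))^*)$. We also note

$$\begin{aligned}
val(r_1) &= \tfrac{7}{20} \ (b, B), & val(r_2) &= \tfrac{3}{20} \ (b, A), & val(r_3) &= \tfrac{1}{7} \ (a, A), \\
val(r_4) &= \tfrac{7}{60} \ (b, U), & val(r_5) &= \tfrac{3}{40} \ (a, B), & val(r_6) &= \tfrac{1}{20} \ (b, U), \\
val(r_7) &= \tfrac{1}{21} \ (a, U), & val(r_8) &= \tfrac{9}{280} \ (a, A), & val(r_9) &= \tfrac{1}{40} \ (a, U), \\
val(r_{10}) &= \tfrac{3}{280} \ (a, U),
\end{aligned}$$

where the symbols between brackets denote the secret and observable traces
of each regular expression.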

Now we have all the ingredients needed to define partial matrices using 
regular expressions. 

Definition 3.5.5. Let $\mathcal{I} = (M, \Sigma_S, \Sigma_O, \Sigma_\tau)$ be an
IHS, $\pi$ its a priori distribution, and $r = r_1 + r_2 + \cdots + r_n$ a regular
expression equivalent to $M$ in normal form. We define for $k = 0, 1, \ldots, n$
the matrices $P^k : \mathcal{S} \times \mathcal{O} \to [0, 1]$ as follows

$$P^k(o|s) = \begin{cases}
0 & \text{if } k = 0, \\
P^{k-1}(o|s) + \dfrac{val(r_k)}{\pi(s)} & \text{if } k \neq 0 \text{ and } \mathrm{trace}_{\Sigma_O}(r_k) = o \text{ and } \mathrm{trace}_{\Sigma_S}(r_k) = s, \\
P^{k-1}(o|s) & \text{otherwise.}
\end{cases}$$

Note that in the context of Definition 3.5.5, we have $P^n = P$.
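The following Python sketch illustrates how partial matrices and the leakage bounds of Proposition 3.5.1 could be computed from a list of path groups. It is a minimal sketch under assumptions: the list `groups` contains, for the Crowds model with p = 0.9 and prior (1/3, 2/3), the probability mass and the (secret, observable) trace of each of the ten terms of the regular expression of Example 3.5.4; the function name `partial_leakage_bounds` is hypothetical.

```python
from fractions import Fraction as F

prior = {"a": F(1, 3), "b": F(2, 3)}
observables = ["A", "B", "U"]

# Assumed input: (probability mass, secret trace, observable trace) per term.
groups = [
    (F(7, 20), "b", "B"), (F(3, 20), "b", "A"), (F(1, 7), "a", "A"),
    (F(7, 60), "b", "U"), (F(3, 40), "a", "B"), (F(1, 20), "b", "U"),
    (F(1, 21), "a", "U"), (F(9, 280), "a", "A"), (F(1, 40), "a", "U"),
    (F(3, 280), "a", "U"),
]

def partial_matrix(groups, k):
    """Partial matrix P^k obtained by adding the first k groups."""
    P = {(s, o): F(0) for s in prior for o in observables}
    for mass, s, o in groups[:k]:
        P[(s, o)] += mass / prior[s]
    return P

def partial_leakage_bounds(groups, k):
    """Lower and upper bound on the multiplicative leakage after k groups."""
    P = partial_matrix(groups, k)
    # Partial vulnerability: sum over observables of max_s P^k(o|s) * pi(s).
    vuln = sum(max(P[(s, o)] * prior[s] for s in prior) for o in observables)
    lower = vuln / max(prior.values())
    # Missing row mass, i.e. the sum of (1 - p_i^k) over all rows.
    slack = sum(1 - sum(P[(s, o)] for o in observables) for s in prior)
    return lower, lower + slack

bounds = [partial_leakage_bounds(groups, k) for k in range(len(groups) + 1)]
```

With all ten groups added the rows of the partial matrix sum to 1, so the two bounds coincide; stopping earlier yields an interval that is guaranteed to contain the exact leakage.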

SCC analysis approach. In [ADvR08], paths that only differ in the way
they traverse strongly connected components (SCC's) are grouped together.
Note that in our case, such paths have the same secret and observable
trace since secret and observable actions cannot occur on cycles. Following
[ADvR08], we first abstract away the SCC's, leaving only probabilistic
transitions that go immediately from an entry point of the SCC to an exit
point (called input and output states in [ADvR08]). This abstraction
happens in such a way that the observable behaviour of the automaton does
not change.

Instead of going into technical details (which also involves translating
the work [ADvR08] from Markov Chains to fully probabilistic automata),
we describe the technique by an example.

Example 3.5.6. Figure 3.2 shows the automaton obtained after abstracting
SCC. In the following we show the set of complete paths of the automaton,
together with their corresponding probabilities and traces

$$\begin{aligned}
\sigma_1 &= init \xrightarrow{a} q_a \xrightarrow{A} corr \xrightarrow{\tau} S, & P(\sigma_1) &= \tfrac{7}{40}, & (a, A), \\
\sigma_2 &= init \xrightarrow{b} q_b \xrightarrow{B} corr \xrightarrow{\tau} S, & P(\sigma_2) &= \tfrac{7}{20}, & (b, B), \\
\sigma_3 &= init \xrightarrow{a} q_a \xrightarrow{U} S, & P(\sigma_3) &= \tfrac{1}{12}, & (a, U), \\
\sigma_4 &= init \xrightarrow{b} q_b \xrightarrow{U} S, & P(\sigma_4) &= \tfrac{1}{6}, & (b, U), \\
\sigma_5 &= init \xrightarrow{a} q_a \xrightarrow{B} corr \xrightarrow{\tau} S, & P(\sigma_5) &= \tfrac{3}{40}, & (a, B), \\
\sigma_6 &= init \xrightarrow{b} q_b \xrightarrow{A} corr \xrightarrow{\tau} S, & P(\sigma_6) &= \tfrac{3}{20}, & (b, A).
\end{aligned}$$

Note that the SCC analysis approach groups more paths together (for
instance $\sigma_1$ groups together the same paths as the regular expressions
$r_3$ and $r_8$ in the examples of this section); as a result channel matrix
and leakage are obtained faster. On the other hand, regular expressions are
more informative, providing more precise feedback.

3.5.3 Identifying high-leakage sources 

We now describe how to use the techniques presented in this section to
identify sources of high leakage of the system. Remember that the a
posteriori vulnerability can be expressed in terms of joint probabilities

$$V(S \mid O) = \sum_o \max_s P([s] \cap [o]).$$

This suggests that, in case we want to identify parts of the system
generating high leakage, we should look at the sets of paths
$[o_1] \cap [s_1], \ldots, [o_n] \cap [s_n]$ where $\{o_1, \ldots, o_n\} = \mathcal{O}$
and $s_i \in \arg(\max_s P([o_i] \cap [s]))$. In fact, the multiplicative
leakage is given by dividing $V(S \mid O)$ by $V(S)$, but since $V(S)$ is a
constant value (i.e., it does not depend on the row) it does not play a role
here. Similarly for the additive case.
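The selection of the sets $[o_i] \cap [s_i]$ can be sketched as follows. This is an illustrative Python fragment, assuming the joint probabilities of the Crowds model with p = 0.9 and prior (1/3, 2/3); the helper name `dominant_secrets` is hypothetical.

```python
from fractions import Fraction as F

# Assumed joint probabilities P([s] ∩ [o]) for the Crowds example.
joint = {
    ("a", "A"): F(7, 40), ("a", "B"): F(3, 40), ("a", "U"): F(1, 12),
    ("b", "A"): F(3, 20), ("b", "B"): F(7, 20), ("b", "U"): F(1, 6),
}

def dominant_secrets(joint):
    """For each observable o, return (s, P([s] ∩ [o])) for a maximizing secret s."""
    best = {}
    for (s, o), prob in joint.items():
        if o not in best or prob > best[o][1]:
            best[o] = (s, prob)
    return best

best = dominant_secrets(joint)
# Summing the selected joint probabilities gives the a posteriori vulnerability.
posterior_vulnerability = sum(prob for _, prob in best.values())
```

The sets of paths behind the selected pairs (here (a, A), (b, B) and (b, U)) are the ones a debugger would inspect first, since they are exactly the terms that make up V(S | O).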

The techniques presented in this section allow us to obtain such sets 
and, furthermore, to partition them in a convenient way with the purpose 
of identifying states/parts of the system that contribute the most to its 
high probability. Indeed, this is the aim of the counterexample generation 
techniques previously presented. For further details on how to debug sets
of paths and why these techniques meet that purpose we refer to [AL08,
DHK08, ADvR08].

Example 3.5.7. To illustrate these ideas, consider the path $\sigma_1$ of the
previous example; this path has maximum probability for the observable
A. By inspecting the path we find the transition with high probability
$q_a \xrightarrow{A} corr$. This suggests to the debugger that the corrupted
user has an excessively high probability of intercepting a message from user
a in case he is the initiator.

In case the debugger requires further information on how corrupted
users can intercept messages, the regular expression approach provides
more detailed information. For instance, we obtain further information by
looking at regular expressions $r_3$ and $r_8$ instead of path $\sigma_1$ (in
particular it is possible to visualize the different ways the corrupted user can
intercept the message of user a when he is the generator of the message).

Figure 3.2: Crowds after the SCC analysis

3.6 Information Hiding Systems with Variable a Priori

In Section 3.3 we introduced a notion of IHS in which the distribution over
secrets is fixed. However, when reasoning about security protocols this is
often not the case. In general we may assume that an adversary knows the
distribution over secrets in each particular instance, but the protocol should
not depend on it. In such a scenario we want the protocol to be secure, i.e.
to ensure low enough leakage, for every possible distribution over secrets.
This leads to the definition of maximum leakage.

Definition 3.6.1 ([Smi09, BCP09]). Given a noisy channel
$C = (\mathcal{S}, \mathcal{O}, P)$, we define the maximum multiplicative and
additive leakage (respectively) as

$$\mathcal{ML}_\times(C) = \max_{\pi \in \mathcal{D}(\mathcal{S})} \mathcal{L}_\times(C, \pi), \quad \text{and} \quad \mathcal{ML}_+(C) = \max_{\pi \in \mathcal{D}(\mathcal{S})} \mathcal{L}_+(C, \pi).$$

In order to model this new scenario where the distribution over secrets may
change, the selection of the secret is modeled as nondeterministic choice. In
this way such a distribution remains undefined in the protocol/automaton.
We still assume that the choice of the secret happens at the beginning, and
that we have only one secret per run. We call such an automaton an IHS with
variable a priori.
Definition 3.6.2. An IHS with variable a priori is a quadruple
$\mathcal{I} = (M, \Sigma_S, \Sigma_O, \Sigma_\tau)$ where
$M = (Q, \Sigma, \hat q, \alpha)$ is a probabilistic automaton,
$\Sigma = \Sigma_S \cup \Sigma_O \cup \Sigma_\tau$ where $\Sigma_S$, $\Sigma_O$,
and $\Sigma_\tau$ are pairwise disjoint sets of secret, observable, and internal
actions, and $\alpha$ satisfies the following restrictions:

1. $\alpha(\hat q) \subseteq \mathcal{D}(\Sigma_S \times Q)$,

2. $|\alpha(\hat q)| = |\mathcal{S}| \wedge \forall s \in \Sigma_S \,.\, \exists q \,.\, \pi(s, q) = 1$, for some $\pi \in \alpha(\hat q)$,

3. $\alpha(q) \subseteq \mathcal{D}(\Sigma_O \cup \Sigma_\tau \times Q)$ and $|\alpha(q)| \le 1$, for all $q \neq \hat q$,

4. $\forall a \in (\Sigma_S \cup \Sigma_O) \,.\, a \notin \mathrm{CyclesA}(M)$,

5. $\forall q, s \ \forall \pi \in \alpha(\hat q) \,.\, (\pi(s, q) = 1 \Rightarrow P(\mathrm{CPaths}_q(M) \cap \mathrm{Paths}^*_q(M)) = 1)$.

Restrictions 1, 2 and 3 imply that the secret choice is nondeterministic
and happens only at the beginning. Additionally, 3 means that all the other
choices are probabilistic. Restriction 4 ensures that the channel associated
to the IHS has finitely many inputs and outputs. Finally, 5 implies that,
after we have chosen a secret, every computation terminates except for a
set with null probability.

Given an IHS with variable a priori, by fixing the a priori distribution 
we can obtain a standard IHS in the obvious way: 

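Definition 3.6.3. Let $\mathcal{I} = ((Q, \Sigma, \hat q, \alpha), \Sigma_S, \Sigma_O, \Sigma_\tau)$
be an IHS with variable a priori and $\pi$ a distribution over $\mathcal{S}$.
We define the IHS associated to $(\mathcal{I}, \pi)$ as
$\mathcal{I}_\pi = ((Q, \Sigma, \hat q, \alpha'), \Sigma_S, \Sigma_O, \Sigma_\tau)$
with $\alpha'(q) = \alpha(q)$ for all $q \neq \hat q$ and
$\alpha'(\hat q)(s, \cdot) = \pi(s)$.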

The following result says that the conditional probabilities associated 
to an IHS with variable a priori are invariant with respect to the a priori 
distribution. This is fundamental in order to interpret the IHS as a channel. 

Proposition 3.6.1. Let $\mathcal{I}$ be an IHS with variable a priori. Then
for all $\pi, \pi' \in \mathcal{D}(\mathcal{S})$ such that $\pi(s) \neq 0$ and
$\pi'(s) \neq 0$ for all $s \in \mathcal{S}$ we have that
$P_{\mathcal{I}_\pi} = P_{\mathcal{I}_{\pi'}}$.

Proof. The secret $s$ appears only once in the tree and only at the beginning
of paths, hence $P([s] \cap [o]) = \alpha'(\hat q)(s, \cdot) \, P_{q_s}([o])$ and
$P([s]) = \alpha'(\hat q)(s, \cdot)$. Therefore $P([o] \mid [s]) = P_{q_s}([o])$,
where $q_s$ is the state after performing $s$. While $\alpha'(\hat q)(s, \cdot)$
is different in $\mathcal{I}_\pi$ and $\mathcal{I}_{\pi'}$, $P_{q_s}([o])$ is the
same, because it only depends on the parts of the paths after the choice of the
secret. □

Note that, although in the previous proposition we exclude input
distributions with zeros, the concepts of vulnerability and leakage also make
sense for these distributions^.

This result implies that we can define the channel matrix of an IHS
$\mathcal{I}$ with variable a priori as the channel matrix of $\mathcal{I}_\pi$
for any $\pi$, and we can compute it, or approximate it, using the same
techniques of previous sections. Similarly we can compute or approximate the
leakage for any given $\pi$.

We now turn the attention to the computation of the maximum leakage. 
The following result from the literature is crucial for our purposes. 

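Proposition 3.6.2 ([BCP09]). Given a channel $C$, we have that
$\arg\max_{\pi \in \mathcal{D}(\mathcal{S})} \mathcal{L}_\times(C, \pi)$ is the
uniform distribution, and
$\arg\max_{\pi \in \mathcal{D}(\mathcal{S})} \mathcal{L}_+(C, \pi)$ is a corner
point distribution, i.e. a distribution $\pi$ such that $\pi(s) = \frac{1}{k}$
on $k$ elements of $\mathcal{S}$, and $\pi(s) = 0$ on all the other elements.

As an obvious consequence, we obtain:

Corollary 3.6.3. Given an IHS $\mathcal{I}$ with variable a priori, we have
$\mathcal{ML}_\times(\mathcal{I}) = \mathcal{L}_\times(\mathcal{I}_\pi)$, where
$\pi$ is the uniform distribution, and
$\mathcal{ML}_+(\mathcal{I}) = \mathcal{L}_+(\mathcal{I}_{\pi'})$, where $\pi'$
is a corner point distribution.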

Corollary 3.6.3 gives us a method to compute the maximum leakages of
$\mathcal{I}$. In the multiplicative case the complexity is the same as for
computing the leakage^. In the additive case we need to find the right corner
point, which can be done by computing the leakages for all corner points and
then comparing them. This method has exponential complexity (in
$|\mathcal{S}|$) as the size of the set of corner points is
$2^{|\mathcal{S}|} - 1$. We conjecture that this complexity is intrinsic, i.e.
that the problem is NP-hard^.

^We assume that conditional probabilities are extended by continuity on such
distributions.

^Actually we can compute it even faster using an observation from [Smi09] which says
that the leakage on the uniform distribution can be obtained simply by summing up the
maximum elements of each column of the channel matrix.
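The two computations described above can be sketched as follows. This is a minimal Python illustration, assuming the Crowds channel matrix for p = 0.9 as input; the function names are hypothetical. The multiplicative case uses the column-maxima observation from [Smi09]; the additive case enumerates all corner point distributions, which is exponential in the number of secrets.

```python
from fractions import Fraction as F
from itertools import combinations

# Assumed channel matrix P(o|s) of the Crowds example (rows: secrets).
channel = {
    "a": {"A": F(21, 40), "B": F(9, 40), "U": F(1, 4)},
    "b": {"A": F(9, 40), "B": F(21, 40), "U": F(1, 4)},
}

def observables(channel):
    return list(next(iter(channel.values())))

def posterior_vulnerability(channel, prior):
    """V(S|O) = sum over o of max over s of P(o|s) * pi(s)."""
    return sum(max(channel[s][o] * prior[s] for s in channel)
               for o in observables(channel))

def max_multiplicative_leakage(channel):
    """Leakage on the uniform prior: sum of the column maxima."""
    return sum(max(channel[s][o] for s in channel) for o in observables(channel))

def max_additive_leakage(channel):
    """Try every corner point distribution (uniform on a non-empty subset)."""
    secrets = list(channel)
    best = F(0)
    for k in range(1, len(secrets) + 1):
        for support in combinations(secrets, k):
            prior = {s: (F(1, k) if s in support else F(0)) for s in secrets}
            leak = posterior_vulnerability(channel, prior) - max(prior.values())
            best = max(best, leak)
    return best

ml_times = max_multiplicative_leakage(channel)
ml_plus = max_additive_leakage(channel)
```

For this two-secret channel the additive maximum is attained on the uniform corner point; point-mass priors leak nothing, since the secret is already known.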

3.7 Interactive Information Hiding Systems 

We now consider extending the framework to interactive systems, namely to
IHS's in which the secrets and the observables can alternate in an arbitrary
way. The secret part of a run is then an element of $\Sigma_S^*$, like the
observable part is an element of $\Sigma_O^*$. The idea is that such a system
models an interactive play between a source of secret information, and a
protocol or program that may produce, each time, some observable in response.
Since each choice is associated to one player of this "game", it seems natural
to impose that in a choice the actions are either secret or observable/hidden,
but not both.

The main novelty and challenge of this extension is that part of the
secrets come after observable events, and may depend on them.

Definition 3.7.1. Interactive IHS's are defined as IHS's (Definition 3.3.1),
except that Restrictions 1 to 3 are replaced by
$\alpha(q) \in \mathcal{D}(\Sigma_S \times Q) \cup \mathcal{D}(\Sigma - \Sigma_S \times Q)$.

Example 3.7.2. Consider an Ebay-like auction protocol with one seller
and two possible buyers, one rich and one poor. The seller first publishes
the item he wants to sell, which can be either cheap or expensive. Then the
two buyers start bidding. At the end, the seller looks at the profile of the
bid winner and decides whether to sell the item or cancel the transaction.
Figure 3.3 illustrates the automaton representing the protocol, for certain
given probability distributions.



^ Since the publication of the article related to this chapter we have proved that our 
conjecture is true. The proof will appear, together with other results, in an extended 
version of the article 
We assume that the identities of the buyers are secret, while the price of
the item and the seller's decision are observable. We ignore for simplicity
the internal actions which are performed during the bidding phase. Hence
$\Sigma_O = \{cheap, expensive, sell, cancel\}$, $\Sigma_\tau = \emptyset$,
$\mathcal{S} = \Sigma_S = \{poor, rich\}$, and
$\mathcal{O} = \{cheap, expensive\} \times \{sell, cancel\}$. The distributions
on $\mathcal{S}$ and $\mathcal{O}$ are defined as usual. For instance we have

$$P([cheap\ sell]) = P(\{q_0 \xrightarrow{cheap} q_1 \xrightarrow{poor} q_3 \xrightarrow{sell} q_7,\ q_0 \xrightarrow{cheap} q_1 \xrightarrow{rich} q_4 \xrightarrow{sell} q_7\})
= \frac{2}{3} \cdot \frac{3}{5} \cdot \frac{4}{5} + \frac{2}{3} \cdot \frac{2}{5} \cdot \frac{3}{4} = \frac{13}{25}.$$

Figure 3.3: Ebay Protocol

Let us now consider how to model the protocol in terms of a noisy
channel. It would seem natural to define the channel associated to the
protocol as the triple $(\mathcal{S}, \mathcal{O}, P)$ where
$P(o \mid s) = P([o] \mid [s]) = \frac{P([s] \cap [o])}{P([s])}$. This
is, indeed, the approach taken in [DJGP02]. For instance, with the protocol
of Example 3.7.2, we would have:

$$P([cheap\ sell] \mid [poor]) = \frac{P([poor] \cap [cheap\ sell])}{P([poor])}
= \frac{\frac{2}{3} \cdot \frac{3}{5} \cdot \frac{4}{5}}{\frac{2}{3} \cdot \frac{3}{5} + \frac{1}{3} \cdot \frac{1}{5}} = \frac{24}{35}. \qquad (3.2)$$

However, it turns out that in the interactive case (in particular when the
secrets are not in the initial phase), it does not make sense to model the
protocol in terms of a channel. At least, not a channel with input
$\mathcal{S}$. In fact, the matrix of a channel is supposed to be invariant
with respect to the input distribution (like in the case of the IHS's with
variable a priori considered in the previous section), and this is not the
case here. The following is a counterexample.



Example 3.7.3. Consider the same protocol as in Example 3.7.2, but
assume now that the distribution over the choice of the buyer is uniform, i.e.
$\alpha(q_1)(poor, q_3) = \alpha(q_1)(rich, q_4) = \alpha(q_2)(poor, q_5) = \alpha(q_2)(rich, q_6) = \frac{1}{2}$.
Then the conditional probabilities are different from those for Example
3.7.2. In particular, in contrast to (3.2), we have

$$P([cheap\ sell] \mid [poor]) = \frac{P([poor] \cap [cheap\ sell])}{P([poor])}
= \frac{\frac{2}{3} \cdot \frac{1}{2} \cdot \frac{4}{5}}{\frac{2}{3} \cdot \frac{1}{2} + \frac{1}{3} \cdot \frac{1}{2}} = \frac{8}{15}.$$
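The dependence of the conditional probabilities on the distribution over buyers can be checked with a short Python sketch. The transition probabilities below are assumptions inferred from the joint probabilities reported for this protocol (item: cheap 2/3, expensive 1/3; the seller sells a cheap item to a poor buyer with probability 4/5); the function name `cheap_sell_given_poor` is hypothetical.

```python
from fractions import Fraction as F

# Assumed probability of each item being published.
item = {"cheap": F(2, 3), "expensive": F(1, 3)}
# Assumed probability that the seller sells a cheap item to a poor winner.
sell_cheap_to_poor = F(4, 5)

def cheap_sell_given_poor(poor_wins):
    """P([cheap sell] | [poor]) for a given buyer distribution.

    poor_wins[i] is the probability that the poor buyer wins when item i
    is published.
    """
    joint = item["cheap"] * poor_wins["cheap"] * sell_cheap_to_poor
    p_poor = sum(item[i] * poor_wins[i] for i in item)
    return joint / p_poor

# Buyer distribution of Example 3.7.2 (assumed from the figure) and the
# uniform one of Example 3.7.3.
original = cheap_sell_given_poor({"cheap": F(3, 5), "expensive": F(1, 5)})
uniform = cheap_sell_given_poor({"cheap": F(1, 2), "expensive": F(1, 2)})
```

The two results differ, which is exactly why the matrix of conditional probabilities cannot be treated as a channel with input S in the interactive setting.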

The above observation, i.e. the fact that the conditional probabilities 
depend on the input distribution, makes it unsound to reason about cer- 
tain information-theoretic concepts in the standard way. For instance, the 
capacity is defined as the maximum mutual information over all possible in- 
put distributions, and the traditional algorithms to compute it are based on 
the assumption that the channel matrix remains the same while the input
distribution varies. This does not make sense anymore in the interactive
setting.

However, when the input distribution is fixed, the matrix of the joint
probabilities is well defined as $P_\wedge(s, o) = P([s] \cap [o])$, and can be
computed or approximated using the same methods as for simple IHS's. The a
priori probability and the channel matrix can then be derived in the standard
way:

$$\pi(s) = \sum_o P_\wedge(s, o), \qquad P(o \mid s) = \frac{P_\wedge(s, o)}{\pi(s)}.$$

Thanks to the formulation (4.1) of the a posteriori vulnerability, the
leakage can be computed directly using the joint probabilities.

Example 3.7.4. Consider the Ebay protocol $\mathcal{I}$ presented in Example
3.7.2. The matrix of the joint probabilities $P_\wedge(s, o)$ is:

          cheap sell   cheap cancel   expensive sell   expensive cancel
  poor    8/25         2/25           1/25             2/75
  rich    1/5          1/15           19/75            1/75

Furthermore $\pi(poor) = \frac{7}{15}$ and $\pi(rich) = \frac{8}{15}$. Hence we
have $\mathcal{L}_\times(\mathcal{I}) = \frac{51}{40}$ and
$\mathcal{L}_+(\mathcal{I}) = \frac{11}{75}$.
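The derivation of the prior and of both leakage measures from a joint matrix can be sketched in Python as follows. The joint probabilities are those of the table above; the variable names are illustrative only.

```python
from fractions import Fraction as F

# Joint probabilities P([s] ∩ [o]) of the Ebay protocol.
joint = {
    "poor": {"cheap sell": F(8, 25), "cheap cancel": F(2, 25),
             "expensive sell": F(1, 25), "expensive cancel": F(2, 75)},
    "rich": {"cheap sell": F(1, 5), "cheap cancel": F(1, 15),
             "expensive sell": F(19, 75), "expensive cancel": F(1, 75)},
}

# A priori distribution: row sums of the joint matrix.
prior = {s: sum(row.values()) for s, row in joint.items()}

# A priori vulnerability: probability of the most likely secret.
v_prior = max(prior.values())

# A posteriori vulnerability: sum over observables of the largest joint entry.
outputs = list(joint["poor"])
v_post = sum(max(joint[s][o] for s in joint) for o in outputs)

multiplicative_leakage = v_post / v_prior
additive_leakage = v_post - v_prior
```

No division by the prior is needed along the way, which is what makes the joint-probability formulation convenient in the interactive case.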



We note that our techniques to compute channel matrices and leakage
extend smoothly to the case where secrets are not required to happen at the
beginning. However, no assumptions can be made about the occurrences
of secrets (they do not need to occur at the beginning anymore). This
increases the complexity of the reachability technique to
$O((|\mathcal{S}| \cdot |\mathcal{O}| \cdot |Q|)^3)$.
On the other hand, complexity bounds for the iterative approach remain
the same.



3.8 Related Work 



To the best of our knowledge, this is the first work dealing with the
efficient computation of channel matrices and leakage. However, for the simple
scenario, channel matrices can be computed using standard model checking
techniques. Chatzikokolakis et al. [CPP08a] have used Prism [PRI] to
model Crowds as a Markov Chain and compute its channel matrix. Each
conditional probability $P(o|s)$ is computed as the probability of reaching
a state where o holds starting from the state where s holds. Since for the
simple version of IHS's secrets occur only once and before observables (as
in Crowds), such a reachability probability equals $P(o|s)$. This procedure
leads to $O(|\mathcal{S}| \cdot |\mathcal{O}| \cdot |Q|^3)$ time complexity to
compute the channel matrix, where $Q$ is the state space of the Markov Chain.

Note that the complexity is expressed in terms of the state space of a
Markov Chain instead of an automaton. Since Markov Chains do not carry
information in transitions they have a larger state space than an equivalent
automaton. Figure 3.4 illustrates this: to model the automaton (left hand
side) we need to encode the information in its transitions into states
of the Markov Chain (right hand side). Therefore, the probability of seeing
observation a and then c in the automaton can be computed as the
probability of reaching the state ac. The Markov Chain used for modeling
Crowds (in our two honest and one corrupted user configuration) has 27
states.

Figure 3.4: Automaton vs Markov Chain

For this reason we conjecture that our complexity
$O(|\mathcal{O}| \cdot |Q|^3)$ is a considerable improvement over the one on
Markov Chains $O(|\mathcal{S}| \cdot |\mathcal{O}| \cdot |Q|^3)$.

With respect to the interactive scenario, standard model checking
techniques do not extend because multiple occurrences of the same secret are
allowed (for instance in our Ebay example, $P(cheap\ sell \mid rich)$ cannot be
derived from reachability probabilities from the two different states of the
automaton where rich holds).



Chapter 4 



Information Hiding in Probabilistic Concurrent Systems

In this chapter we study the problem of information hiding in
systems characterized by the coexistence of randomization and
concurrency. Anonymity and Information Flow are examples
of this notion. It is well known that the presence of nondeterminism,
due to the possible interleavings and interactions of the
parallel components, can cause unintended information leaks.
The most established approach to solve this problem is to fix the
strategy of the scheduler beforehand. In this work, we propose
a milder restriction on the schedulers, and we define the notion
of strong (probabilistic) information hiding under various notions
of observables. Furthermore, we propose a method, based
on the notion of automorphism, to verify that a system satisfies
the property of strong information hiding, namely strong anonymity
or non-interference, depending on the context. Throughout
the chapter, we use the canonical example of the Dining Cryptographers
to illustrate our ideas and techniques.






4.1 Introduction 

The problem of information hiding consists in trying to prevent the
adversary from inferring confidential information from the observables.
Instances of this issue are Anonymity and Information Flow. In both fields
there is a growing interest in the quantitative aspects of the problem, see
for instance [HO05, BP05, ZB05, CHM05a, CHM05b, Mal07, MC08, BCP08, CMS09,
CPP08a, CPP08b, Smi09]. This is justified by the fact that often we have
some a priori knowledge about the likelihood of the various secrets (which
we can usually express in terms of a probability distribution), and by the
fact that protocols often use randomized actions to obfuscate the link
between secret and observable, like in the case of the anonymity protocols of
DC Nets [Cha88], Crowds [RR98], Onion Routing [SGR97], and Freenet
[CSWH00].

In a concurrent setting, like in the case of multi-agent systems, there 
is also another source of uncertainty, which derives from the fact that the 
various entities may interleave and interact in ways that are usually un- 
predictable, either because they depend on factors that are too complex to 
analyze, or because (in the case of specifications) they are implementation- 
dependent. 

The formal analysis of systems which exhibit probabilistic and nonde- 
terministic behavior usually involves the use of so-called schedulers, which 
are functions that, for each path, select only one possible (probabilistic) 
transition, thus delivering a purely probabilistic execution tree, where each 
event has a precise probability. 

In the area of security, there is the problem that secret choices, like all
choices, give rise to different paths. On the other hand, the decision of the
scheduler may influence the observable behavior of the system. Therefore
the security properties are usually violated if we admit as schedulers all
possible functions of the paths: certain schedulers induce a dependence of
the observables on the secrets, and protocols which would not leak secret
information when running in "real" systems (where the scheduling devices
cannot "see" the internal secrets of the components and therefore cannot
depend on them), do leak secret information under this more permissive
notion of scheduler. This is a well known problem for which various
solutions have already been proposed [CCK+06a, CCK+06b, CP10, CNP09].
We will come back to these in the "Related work" section.

4.1.1 Contribution 

We now list the main contributions of this chapter:

• We define a class of partial-information schedulers (which we call
admissible). Schedulers in this class are a restricted version of standard
(full-information) schedulers. The restriction is rather flexible and has
strong structural properties, thus facilitating the reasoning about se- 
curity properties. In short, our systems consist of parallel components 
with certain restrictions on the secret choices and nondeterministic 
choices. The scheduler selects the next component (or components, 
in case of synchronization) for the subsequent step independently of 
the secret choices. We then formalize the notion of quantitative in- 
formation flow, or degree of anonymity, using this restricted notion 
of scheduler. 

• We propose alternative definitions to the property of strong anony- 
mity defined in [BP05]. Our proposal differs from the original defi- 
nition in two aspects: (1) the system should be strongly anonymous 
for all admissible schedulers instead of all schedulers (which is a very 
strong condition, never satisfied in practice), (2) we consider several 
variants of adversaries, namely (in increasing level of power): external 
adversaries, internal adversaries, and adversaries in collusion with the 
scheduler (in a Dolev-Yao fashion). Additionally, we use admissible 
schedulers to extend the notions of multiplicative and additive leak- 
age (proposed in [Smi09] and [BCP09] respectively) to the case of a 
concurrent system. 

• We propose a sufficient technique to prove probabilistic strong ano- 
nymity, and probabilistic noninterference, based on automorphisms. 
The idea is the following: in the purely nondeterministic setting, the
strong anonymity of a system is often proved (or defined) as follows:
take two users A and B and a trace in which user A is 'the culprit'.
Now find a trace that looks the same to the adversary, but in which 
user B is 'the culprit' [HO05, GHvRP05, MVdV04, HK07c]. This 
new trace is often most easily obtained by switching the behavior of 
A and B. Non-interference can be proved in the same way (where A 
and B are high information and the trace is the low information). 

In this work, we make this technique explicit for anonymity in systems
where probability and nondeterminism coexist, and where we need to cope
with the restrictions on the schedulers. We formalize the notion of
switching behaviors by using automorphisms (it is possible to switch
the behavior of A and B if there exists an automorphism between
them) and then show that the existence of an automorphism implies
strong anonymity.

• We illustrate the problem with full-information schedulers in security,
our solution based on admissible schedulers, and the application of
our proof technique by means of the well-known Dining Cryptographers
anonymity protocol.

4.2 Preliminaries 

In this section we gather preliminary notions and results related to prob- 
abilistic automata [SL95, Seg95], information theory [CT06], and informa- 
tion leakage [Smi09, BCP09]. 

4.2.1 Probabilistic automata 

A function $\mu\colon Q \to [0,1]$ is a discrete probability distribution on a set $Q$
if $\sum_{q \in Q} \mu(q) = 1$. The set of all discrete probability distributions on $Q$ is
denoted by $\mathcal{D}(Q)$.

A probabilistic automaton is a quadruple $M = (Q, \Sigma, \hat{q}, \theta)$ where $Q$ is a
countable set of states, $\Sigma$ a finite set of actions, $\hat{q}$ the initial state, and $\theta$
a transition function $\theta\colon Q \to \mathcal{P}(\mathcal{D}(\Sigma \times Q))$. Here $\mathcal{P}(X)$ is the set of all
subsets of $X$.

If $\theta(q) = \emptyset$, then $q$ is a terminal state. We write $q \to \mu$ for $\mu \in \theta(q)$, $q \in Q$.
Moreover, we write $q \xrightarrow{a} r$ for $q, r \in Q$ whenever $q \to \mu$ and $\mu(a, r) > 0$. A
fully probabilistic automaton is a probabilistic automaton satisfying $|\theta(q)| \le 1$
for all states. In case $\theta(q) \neq \emptyset$ in a fully probabilistic automaton, we will
overload notation and use $\theta(q)$ to denote the distribution outgoing from $q$.
A path in a probabilistic automaton is a sequence $\sigma = q_0 \xrightarrow{a_1} q_1 \xrightarrow{a_2} \cdots$ where
$q_i \in Q$, $a_i \in \Sigma$, and $q_i \xrightarrow{a_{i+1}} q_{i+1}$. A path can be finite, in which case it ends
with a state. A path is complete if it is either infinite or finite ending in
a terminal state. Given a path $\sigma$, $\mathrm{first}(\sigma)$ denotes its first state, and if $\sigma$
is finite then $\mathrm{last}(\sigma)$ denotes its last state. A cycle is a path $\sigma$ such that
$\mathrm{last}(\sigma) = \mathrm{first}(\sigma)$. Let $\mathrm{Paths}_q(M)$ denote the set of all paths, $\mathrm{Paths}^*_q(M)$
the set of all finite paths, and $\mathrm{CPaths}_q(M)$ the set of all complete paths of
an automaton $M$, starting from the state $q$. We will omit $q$ if $q = \hat{q}$. Paths
are ordered by the prefix relation, which we denote by $\le$. The trace of a
path is the sequence of actions in $\Sigma^* \cup \Sigma^\infty$ obtained by removing the states,
hence for the above path $\sigma$ we have $\mathrm{trace}(\sigma) = a_1 a_2 \cdots$. If $\Sigma' \subseteq \Sigma$, then
$\mathrm{trace}_{\Sigma'}(\sigma)$ is the projection of $\mathrm{trace}(\sigma)$ on the elements of $\Sigma'$.

Let $M = (Q, \Sigma, \hat{q}, \theta)$ be a (fully) probabilistic automaton, $q \in Q$ a state,
and let $\sigma \in \mathrm{Paths}^*_q(M)$ be a finite path starting in $q$. The cone generated by
$\sigma$ is the set of complete paths $\langle\sigma\rangle = \{\sigma' \in \mathrm{CPaths}_q(M) \mid \sigma \le \sigma'\}$. Given a
fully probabilistic automaton $M = (Q, \Sigma, \hat{q}, \theta)$ and a state $q$, we can calculate
the probability value, denoted by $P_q(\sigma)$, of any finite path $\sigma$ starting in $q$
as follows: $P_q(q) = 1$ and $P_q(\sigma \xrightarrow{a} q') = P_q(\sigma)\,\mu(a, q')$, where $\mathrm{last}(\sigma) \to \mu$.

Let $\Omega_q = \mathrm{CPaths}_q(M)$ be the sample space, and let $\mathcal{F}_q$ be the smallest
$\sigma$-algebra generated by the cones. Then $P_q$ induces a unique probability
measure on $\mathcal{F}_q$ (which we will also denote by $P_q$) such that $P_q(\langle\sigma\rangle) = P_q(\sigma)$
for every finite path $\sigma$ starting in $q$. For $q = \hat{q}$ we write $P$ instead of $P_{\hat{q}}$.

A (full-information) scheduler for a probabilistic automaton $M$ is a
function $\zeta\colon \mathrm{Paths}^*(M) \to (\mathcal{D}(\Sigma \times Q) \cup \{\bot\})$ such that for all finite paths
$\sigma$, if $\theta(\mathrm{last}(\sigma)) \neq \emptyset$ then $\zeta(\sigma) \in \theta(\mathrm{last}(\sigma))$, and $\zeta(\sigma) = \bot$ otherwise. Hence,
a scheduler $\zeta$ selects one of the available transitions in each state, and
determines therefore a fully probabilistic automaton, obtained by pruning
from $M$ the alternatives that are not chosen by $\zeta$. Note that a scheduler is
history dependent since it can take different decisions for the same state $s$
according to the past evolution of the system.
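
The following is a minimal sketch, not part of the original development, of how a scheduler resolves nondeterminism and how the probability of a finite path is then computed by the recursive rule above. The toy automaton `theta`, the path encoding as a tuple `(q0, a1, q1, ...)`, and the function names are all assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical toy automaton: theta maps a state to the list of distributions
# available there; a distribution maps (action, next_state) to a probability.
theta = {
    "q0": [
        {("a", "q1"): Fraction(1, 2), ("b", "q2"): Fraction(1, 2)},
        {("c", "q2"): Fraction(1)},
    ],
    "q1": [{("d", "q2"): Fraction(1)}],
    "q2": [],  # terminal state: theta(q2) is empty
}


def last(path):
    """Last state of a finite path encoded as (q0, a1, q1, ..., an, qn)."""
    return path[-1]


def first_option_scheduler(path):
    """A full-information scheduler: it receives the whole finite path and
    returns one distribution from theta(last(path)), or None (standing for
    the bottom symbol) when the last state is terminal."""
    options = theta[last(path)]
    return options[0] if options else None


def path_probability(path, scheduler):
    """P(q0) = 1 and P(sigma -a-> q') = P(sigma) * mu(a, q'),
    where mu is the distribution the scheduler selects after sigma."""
    prob = Fraction(1)
    for i in range(1, len(path), 2):
        prefix, action, target = path[:i], path[i], path[i + 1]
        mu = scheduler(prefix)
        if mu is None:  # no transition is available after this prefix
            return Fraction(0)
        prob *= mu.get((action, target), Fraction(0))
    return prob
```

Under `first_option_scheduler`, the path `("q0", "a", "q1", "d", "q2")` gets probability 1/2, while `("q0", "c", "q2")` gets probability 0 because the scheduler never selects the second distribution in `q0`.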

4.2.2 Noisy Channels 

This section briefly recalls the notion of noisy channels from Information 
Theory [CT06]. 

A noisy channel is a tuple $C = (\mathcal{X}, \mathcal{Y}, P(\cdot|\cdot))$ where $\mathcal{X} = \{x_1, x_2, \ldots, x_n\}$
is a finite set of input values, modeling the secrets of the channel, and
$\mathcal{Y} = \{y_1, y_2, \ldots, y_m\}$ is a finite set of output values, the observables of the
channel. For $x_i \in \mathcal{X}$ and $y_j \in \mathcal{Y}$, $P(y_j|x_i)$ is the conditional probability
of obtaining the output $y_j$ given that the input is $x_i$. These conditional
probabilities constitute the so-called channel matrix, where $P(y_j|x_i)$ is the
element at the intersection of the $i$-th row and the $j$-th column. For any
input distribution $P_X$ on $\mathcal{X}$, $P_X$ and the channel matrix determine a joint
probability $P_\wedge$ on $\mathcal{X} \times \mathcal{Y}$, and the corresponding marginal probability $P_Y$ on
$\mathcal{Y}$ (and hence a random variable $Y$). $P_X$ is also called the a priori distribution
and it is often denoted by $\pi$. The probability of the input given the output
is called the a posteriori distribution.
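
As an illustration only, the quantities just introduced can be computed directly from a prior and a channel matrix. The sketch below assumes a dictionary encoding (`prior[x]` for the a priori distribution, `channel[x][y]` for the conditional probability of output `y` given input `x`); the function names and the 2x2 example channel are hypothetical.

```python
from fractions import Fraction


def joint(prior, channel):
    """Joint probability P(x, y) = P_X(x) * P(y | x)."""
    return {(x, y): prior[x] * p
            for x, row in channel.items()
            for y, p in row.items()}


def marginal_output(prior, channel):
    """Marginal probability P_Y(y) = sum over x of P(x, y)."""
    p_y = {}
    for (_, y), p in joint(prior, channel).items():
        p_y[y] = p_y.get(y, 0) + p
    return p_y


def posterior(prior, channel):
    """A posteriori distribution P(x | y) = P(x, y) / P_Y(y),
    defined only for outputs with P_Y(y) > 0."""
    p_y = marginal_output(prior, channel)
    return {(x, y): p / p_y[y]
            for (x, y), p in joint(prior, channel).items()
            if p_y[y] > 0}


# Hypothetical example: two secrets, two observables.
prior = {"x1": Fraction(1, 4), "x2": Fraction(3, 4)}
channel = {
    "x1": {"y1": Fraction(1, 2), "y2": Fraction(1, 2)},
    "x2": {"y1": Fraction(1, 3), "y2": Fraction(2, 3)},
}
```

With these numbers, `marginal_output(prior, channel)["y1"]` is 3/8 and `posterior(prior, channel)[("x1", "y1")]` is 1/3.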

4.2.3 Information leakage 

We recall here the definitions of multiplicative leakage proposed in [Smi09],
and of additive leakage proposed in [BCP09]¹. We assume given a noisy
channel $C = (\mathcal{X}, \mathcal{Y}, P(\cdot|\cdot))$ and a random variable $X$ on $\mathcal{X}$. The a priori
vulnerability of the secrets in $\mathcal{X}$ is the probability of guessing the right
secret, defined as $V(X) = \max_{x \in \mathcal{X}} P_X(x)$. The rationale behind this
definition is that the adversary's best bet is on the secret with highest
probability. The a posteriori vulnerability of the secrets in $\mathcal{X}$ is the
probability of guessing the right secret, after the output has been observed,
averaged over the probabilities of the observables. The formal definition is
$V(X \mid Y) = \sum_{y \in \mathcal{Y}} P_Y(y) \max_{x \in \mathcal{X}} P(x \mid y)$. Again, this definition is based on
the principle that the adversary will choose the secret with the highest a
posteriori probability.

Note that, using Bayes theorem, we can write the a posteriori vulnerability
in terms of the channel matrix and the a priori distribution, or in
terms of the joint probability:

$$V(X \mid Y) = \sum_{y \in \mathcal{Y}} \max_{x \in \mathcal{X}} \bigl(P(y \mid x)\, P_X(x)\bigr) = \sum_{y \in \mathcal{Y}} \max_{x \in \mathcal{X}} P_\wedge(x, y).$$

The multiplicative leakage is $\mathcal{L}_\times(C, P_X) = \frac{V(X \mid Y)}{V(X)}$, whereas the additive
leakage is $\mathcal{L}_+(C, P_X) = V(X \mid Y) - V(X)$.

¹The notion proposed by Smith in [Smi09] was given in an (equivalent) logarithmic
form, and called simply leakage. For uniformity's sake we use here the terminology and
formulation of [BCP09].

4.2.4 Dining Cryptographers

This problem, described by Chaum in [Cha88], involves a situation in which
three cryptographers are dining together. At the end of the dinner, each of
them is secretly informed by a central agency (master) whether he should
pay the bill, or not. So, either the master will pay, or one of the
cryptographers will be asked to pay. The cryptographers (or some external observer)
would like to find out whether the payer is one of them or the master.
However, if the payer is one of them, they also wish to maintain anonymity
over the identity of the payer.

A possible solution to this problem, described in [Cha88], is that each
cryptographer tosses a coin, which is visible to himself and his neighbor
to the left. Each cryptographer observes the two coins that he can see
and announces agree or disagree. If a cryptographer is not paying, he will
announce agree if the two sides are the same and disagree if they are not.
The paying cryptographer will say the opposite. It can be proved that if the
number of disagrees is even, then the master is paying; otherwise, one of the
cryptographers is paying. Furthermore, in case one of the cryptographers is
paying, neither an external observer nor the other two cryptographers can
identify, from their individual information, who exactly is paying (provided
that the coins are fair). The Dining Cryptographers (DC) will be a running
example throughout the chapter.










Figure 4.1: Chaum's system for the Dining Cryptographers ([Cha88]) 
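
The parity claim of the Dining Cryptographers protocol can be checked exhaustively. The sketch below is only an illustration under stated assumptions: cryptographer `i` is assumed to see `coins[i]` and `coins[(i + 1) % n]`, an announcement is encoded as 0 (agree) or 1 (disagree), and the function name `announcements` is hypothetical.

```python
import itertools


def announcements(payer, coins):
    """Announcements of all cryptographers for one run of the protocol.

    payer: index of the paying cryptographer, or None if the master pays.
    coins: tuple of coin outcomes (0 or 1), one per cryptographer.
    Returns a list with 0 for "agree" and 1 for "disagree".
    """
    n = len(coins)
    result = []
    for i in range(n):
        # A non-payer says "disagree" exactly when the two coins differ.
        says_disagree = 0 if coins[i] == coins[(i + 1) % n] else 1
        if payer == i:
            says_disagree ^= 1  # the payer says the opposite
        result.append(says_disagree)
    return result


def payer_is_cryptographer(outs):
    """Decode the observable: an odd number of disagrees means that one of
    the cryptographers (rather than the master) is paying."""
    return sum(outs) % 2 == 1
```

Enumerating all eight coin outcomes with `itertools.product([0, 1], repeat=3)` confirms that the number of disagrees is even when the master pays and odd otherwise, independently of the coins.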

4.3 Systems 

In this section we describe the kind of systems we are dealing with. We
start by introducing a variant of probabilistic automata, that we call tagged
probabilistic automata (TPA). These systems are parallel compositions of
purely probabilistic processes, that we call components. They are equipped 
with a unique identifier, that we call tag, or label, of the component. Note 
that, because of the restriction that the components are fully determinis- 
tic, nondeterminism is generated only from the interleaving of the parallel 
components. Furthermore, because of the uniqueness of the tags, each tran- 
sition from a node is associated to a different tag / pair of two tags (one in 
case only one component makes a step, and two in case of a synchronization 
step among two components). 

4.3.1 Tagged Probabilistic Automata 

We now formalize the notion of TPA. 

Definition 4.3.1. A tagged probabilistic automaton (TPA) is a tuple
$(Q, L, \Sigma, \hat{q}, \theta)$, where

• $Q$ is a set of states,

• $L$ is a set of tags, or labels,

• $\Sigma$ is a set of actions,

• $\hat{q} \in Q$ is the initial state,

• $\theta\colon Q \to \mathcal{P}(L \times \mathcal{D}(\Sigma \times Q))$ is a transition function,

with the additional requirement that for every $q \in Q$ and every $\ell \in L$ there
is at most one $\mu \in \mathcal{D}(\Sigma \times Q)$ such that $(\ell, \mu) \in \theta(q)$.

A path for a TPA is a sequence $\sigma = q_0 \xrightarrow{\ell_1, a_1} q_1 \xrightarrow{\ell_2, a_2} q_2 \cdots$. In this way,
the process with identifier $\ell_i$ induces the system to move from $q_{i-1}$ to $q_i$
performing the action $a_i$, and it does so with probability $\mu_{\ell_i}(a_i, q_i)$, where
$\mu_{\ell_i}$ is the distribution associated to the choice made by the component $\ell_i$.
Finite paths and complete paths are defined in a similar manner.

In a TPA, the scheduler's choice is determined by the choice of the tag.
We will use $\mathrm{enab}(q)$ to denote the tags of the components that are enabled
to make a transition. Namely,

$$\mathrm{enab}(q) = \{\ell \in L \mid \exists \mu \in \mathcal{D}(\Sigma \times Q) : (\ell, \mu) \in \theta(q)\} \qquad (4.1)$$

We assume that the scheduler is forced to select a component among 
those which are enabled, i.e., that the execution does not stop unless all 
components are blocked (suspended or terminated) . This is in line with the 
spirit of process algebra, and also with the tradition of Markov Decision 
Processes, but contrasts with that of the Probabilistic Automata of Lynch 
and Segala [SL95]. However, the results in this chapter do not depend on 
this assumption; we could as well allow schedulers which decide to terminate 
the execution even though there are transitions which are possible from the 
last state. 
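
A minimal sketch of the function enab from (4.1) and of a tag-selecting scheduler follows. The encoding is an assumption made for illustration: `theta` maps each state to a dictionary from tags to distributions (a dictionary keyed by tag automatically gives at most one distribution per tag, as the definition of TPA requires), and a finite path is a tuple `(q0, tag1, a1, q1, ...)`.

```python
from fractions import Fraction

# Hypothetical toy TPA: two components with tags "1" and "2".
theta = {
    "q0": {
        "1": {("a", "q1"): Fraction(1)},
        "2": {("tau", "q2"): Fraction(1, 2), ("tau", "q3"): Fraction(1, 2)},
    },
    "q1": {"2": {("b", "q3"): Fraction(1)}},
    "q2": {},
    "q3": {},
}


def enab(q):
    """Tags of the components that can make a transition from state q."""
    return set(theta[q])


def lowest_tag_scheduler(path):
    """A scheduler for the TPA: returns an enabled tag whenever one exists,
    and None (standing for the bottom symbol) only if no component is enabled."""
    enabled = enab(path[-1])
    return min(enabled) if enabled else None
```

This scheduler never stops while some component is enabled, which matches the assumption stated in the text; a scheduler allowed to stop early would simply be permitted to return `None` in more situations.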

Definition 4.3.2. A scheduler for a TPA $M = (Q, L, \Sigma, \hat{q}, \theta)$ is a function
$\zeta\colon \mathrm{Paths}^*(M) \to (L \cup \{\bot\})$ such that for all finite paths $\sigma$,
$\zeta(\sigma) \in \mathrm{enab}(\mathrm{last}(\sigma))$ if $\mathrm{enab}(\mathrm{last}(\sigma)) \neq \emptyset$ and $\zeta(\sigma) = \bot$ otherwise.

4.3.2 Components



To specify the components we use a sort of probabilistic version of CCS [Mil89,
Mil99]. We assume a set of secret actions $\Sigma_S$ with elements $s, s_1, s_2, \ldots$, and
a disjoint set of observable actions $\Sigma_O$ with elements $a, a_1, a_2, \ldots$. Furthermore
we have communication actions of the form $c(x)$ (receive $x$ on channel
$c$, where $x$ is a formal parameter), or $\bar{c}\langle v \rangle$ (send $v$ on channel $c$, where $v$
is a value on some domain $V$). Sometimes we need only to synchronize
without transmitting any value, in which case we will use simply $c$ and $\bar{c}$.
We denote the set of channel names by $C$.

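A component $q$ is specified by the following grammar:

$$
\begin{array}{llll}
\text{Components} & q ::= & 0 & \text{termination} \\
 & \quad\mid & a.q & \text{observable prefix} \\
 & \quad\mid & \bigl(\sum_i p_i : q_i\bigr) & \text{blind choice} \\
 & \quad\mid & \bigl(\sum_i p_i : s_i.q_i\bigr) & \text{secret choice} \\
 & \quad\mid & \text{if } x = v \text{ then } q_1 \text{ else } q_2 & \text{conditional} \\
 & \quad\mid & A & \text{process call} \\[1ex]
\text{Observables} & a ::= & c \mid \bar{c} & \text{simple synchronization} \\
 & \quad\mid & c(x) \mid \bar{c}\langle v \rangle & \text{synchronization and communication}
\end{array}
$$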

The $p_i$, in the blind and secret choices, represents the probability of
the $i$-th branch and must satisfy $0 \le p_i \le 1$ and $\sum_i p_i = 1$. When no
confusion arises, we use simply $+$ for a binary choice. The process call $A$
is a simple process identifier. For each of them, we assume a corresponding
unique process declaration of the form $A \stackrel{\mathrm{def}}{=} q$. The idea is that, whenever
$A$ is executed, it triggers the execution of $q$. Note that $q$ can contain $A$ or
another process identifier, which means that our language allows (mutual)
recursion.

Note that each component contains only probabilistic and sequential
constructs. In particular, there is no internal parallelism. Hence each com- 
ponent corresponds to a purely probabilistic automaton (apart from the 
input nondeterminism, which disappears in the definition of a system), as 
described by the operational semantics below. The main reason to dismiss 
the use of internal parallelism is verification: as mentioned in the Intro- 
duction we will present a proof technique for the different definitions of 
anonymity proposed in this work. This result would not be possible with- 
out such restriction on the components (see Example 4.6.4). 

For an extension of this framework allowing the use of internal
parallelism we refer to [AAPvR10]. There, the authors combine global
nondeterminism (arising from the interleaving of the components) and local
nondeterminism (arising from the internal parallelism of the components).
The authors use this extended framework for a different purpose than
ours, namely to define a notion of equivalence suitable for security analysis.
No verification mechanisms are provided in [AAPvR10].

Components' semantics: The operational semantics consists of probabilistic
transitions of the form $q \to \mu$ where $q \in Q$ is a process, and $\mu \in \mathcal{D}(\Sigma \times Q)$ is
a distribution on actions and processes. They are specified by the following
rules:

$$\text{PRF1}\quad \frac{v \in V}{c(x).q \to \delta(c(v), q[v/x])}$$

$$\text{PRF2}\quad a.q \to \delta(a, q) \quad \text{if } a \neq c(x)$$

$$\text{INT}\quad \sum_i p_i : q_i \to \sum_i p_i \cdot \delta(\tau, q_i)$$

$$\text{SECR}\quad \sum_i p_i : s_i.q_i \to \sum_i p_i \cdot \delta(s_i, q_i)$$

$$\text{CND1}\quad \text{if } v = v \text{ then } q_1 \text{ else } q_2 \to \delta(\tau, q_1)$$

$$\text{CND2}\quad \frac{v \neq v'}{\text{if } v = v' \text{ then } q_1 \text{ else } q_2 \to \delta(\tau, q_2)}$$

$$\text{CALL}\quad \frac{q \to \mu}{A \to \mu} \quad \text{if } A \stackrel{\mathrm{def}}{=} q$$

$\sum_i p_i \cdot \mu_i$ is the distribution $\mu$ such that $\mu(x) = \sum_i p_i\, \mu_i(x)$. We use $\delta(x)$
to represent the delta of Dirac, which assigns probability 1 to $x$. The silent
action, $\tau$, is a special action different from all the observable and the secret
actions. $q[v/x]$ stands for the process $q$ in which any occurrence of $x$ has
been replaced by $v$. To shorten the notation, in the examples throughout
the chapter, we omit writing explicit termination, i.e., we omit the symbol
$0$ at the end of a term.

4.3.3 Systems 

A system consists of n processes (components) in parallel, restricted at the 
top-level on the set of channel names C : 

$$(C)\; q_1 \parallel q_2 \parallel \cdots \parallel q_n.$$

The restriction on C enforces synchronization (and possibly communica- 
tion) on the channel names belonging to C, in accordance with the CCS 
spirit. Since C is the set of all channels, all of them are forced to syn- 
chronize. This is to eliminate, at the level of systems, the nondeterminism 
generated by the rule for the receive prefix, PRFl. 

Systems' semantics: The semantics of a system gives rise to a TPA, where
the states are terms representing systems during their evolution. A transition
now is of the form $q \xrightarrow{\ell} \mu$ where $\mu \in \mathcal{D}(\Sigma \times Q)$ and $\ell \in L$ is either
the identifier of the component which makes the move, or a two-element
set of identifiers representing the two partners of a synchronization. The
following two rules provide the operational semantics in the case of
interleaving and synchronization/communication, respectively.

Interleaving. If $a_j \notin C$:

$$\frac{q_i \to \sum_j p_j \cdot \delta(a_j, q_{ij})}{(C)\; q_1 \parallel \cdots \parallel q_i \parallel \cdots \parallel q_n \xrightarrow{\,i\,} \sum_j p_j \cdot \delta\bigl(a_j, (C)\; q_1 \parallel \cdots \parallel q_{ij} \parallel \cdots \parallel q_n\bigr)}$$

where $i$ indicates the tag of the component making the step.

Synchronization/Communication.

$$\frac{q_i \to \delta(\bar{c}\langle v \rangle, q_i') \qquad q_j \to \delta(c(v), q_j')}{(C)\; q_1 \parallel \cdots \parallel q_i \parallel \cdots \parallel q_j \parallel \cdots \parallel q_n \xrightarrow{\{i,j\}} \delta\bigl(\tau, (C)\; q_1 \parallel \cdots \parallel q_i' \parallel \cdots \parallel q_j' \parallel \cdots \parallel q_n\bigr)}$$

Here $\{i,j\}$ is the tag indicating that the components making the step are
$i$ and $j$. For simplicity we write $\xrightarrow{i,j}$ instead of $\xrightarrow{\{i,j\}}$. The rule for
synchronization without communication is similar; the only difference is that we do
not have $\langle v \rangle$ and $(v)$ in the actions. Note that $c$ can only be an observable
action (neither a secret nor $\tau$), by the assumption that channel names can
only be observable actions.

We note that both interleaving and synchronization rules generate non- 
determinism. The only other source of nondeterminism is PRFl, the rule 
for a receive prefix c{x). However the latter is not real nondeterminism: 
it is introduced in the semantics of the components but it disappears in 
the semantics of the systems, given that the channel $c$ is restricted at the
top-level. In fact the restriction enforces communication, and when
communication takes place, only the branch corresponding to the actual value $v$
transmitted by the corresponding send action is maintained, all the others 
disappear. 

Proposition 4.3.3. The operational semantics of a system is a TPA with 
the following characteristics: 

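(a) Every step $q \xrightarrow{\ell} \mu$ is either

a blind choice: $\mu = \sum_i p_i \cdot \delta(\tau, q_i)$, or

a secret choice: $\mu = \sum_i p_i \cdot \delta(s_i, q_i)$, or

a delta of Dirac: $\mu = \delta(a, q')$ with $a \in \Sigma_O$ or $a = \tau$.

(b) If $q \xrightarrow{\ell} \mu$ and $q \xrightarrow{\ell} \mu'$ then $\mu = \mu'$.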

Proof. For (a) , we have that the rules for the components and the rule 
for synchronization / communication can only produce blind choices, se- 
cret choices, or deltas of Dirac. Furthermore, because of the restriction on 
all channels, the transitions at the system level cannot contain communi- 
cation actions. Finally, observe that the interleaving rule maintains these 
properties. 

As for (b), we know that at the component level, the only source of 
nondeterminism is PRFl, the rule for a receive prefix c{x). At the sys- 
tem level, this action is forced to synchronize with a corresponding send 
action, and, in a component, there can be only one such action available 
at a time. Hence the tag determines the value to be sent, which in turn 
determines the selection of exactly one branch in the receiving process. 
The only other sources of nondeterminism are the interleaving and the syn- 
chronization/communication rules, and they induce a different tag for each 
alternative. □ 

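Example 4.3.1. We now present the components for the Dining Cryptographers
using the introduced syntax. They correspond to Figure 4.1 and
to the automata depicted in Figure 4.3. As announced before, we omit the
symbol $0$ for explicit termination at the end of each term. The secret
actions $s_i$ represent the choice of the payer. The operators $\oplus$, $\ominus$ represent the
sum modulo 2 and the difference modulo 2, respectively. The test $i == n$
returns 1 (true) if $i = n$, and 0 otherwise. The set of restricted channel
names is $C = \{c_{0,0}, c_{0,1}, c_{1,1}, c_{1,2}, c_{2,0}, c_{2,2}, m_0, m_1, m_2\}$.

$$
\begin{array}{rcl}
\mathit{Master} & \stackrel{\mathrm{def}}{=} & p : \bar{m}_0\langle 0 \rangle . \bar{m}_1\langle 0 \rangle . \bar{m}_2\langle 0 \rangle + (1 - p) : \sum_{i=0}^{2} p_i : s_i .\, \bar{m}_0\langle i == 0 \rangle . \bar{m}_1\langle i == 1 \rangle . \bar{m}_2\langle i == 2 \rangle \\
\mathit{Crypt}_i & \stackrel{\mathrm{def}}{=} & m_i(\mathit{pay}) .\, c_{i,i}(\mathit{coin}_1) .\, c_{i,i \oplus 1}(\mathit{coin}_2) .\, \overline{\mathit{out}}_i\langle \mathit{pay} \oplus \mathit{coin}_1 \oplus \mathit{coin}_2 \rangle \\
\mathit{Coin}_i & \stackrel{\mathrm{def}}{=} & 0.5 : \bar{c}_{i,i}\langle 0 \rangle . \bar{c}_{i \ominus 1,i}\langle 0 \rangle + 0.5 : \bar{c}_{i,i}\langle 1 \rangle . \bar{c}_{i \ominus 1,i}\langle 1 \rangle \\
\mathit{System} & \stackrel{\mathrm{def}}{=} & (C)\; \mathit{Master} \parallel \prod_{i=0}^{2} \mathit{Crypt}_i \parallel \prod_{i=0}^{2} \mathit{Coin}_i
\end{array}
$$

Figure 4.2: Dining Cryptographers CCS

The operation $\mathit{pay} \oplus \mathit{coin}_1 \oplus \mathit{coin}_2$ in Figure 4.2 is syntactic sugar; it
can be defined using the if-then-else operator. Note that, in this way, if
a cryptographer is not paying ($\mathit{pay} = 0$), then he announces 0 if the two
coins are the same (agree) and 1 if they are not (disagree).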
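
To make the role of the fair coins concrete, the sketch below enumerates all coin outcomes and computes the distribution of the three announcements for each possible paying cryptographer. It is an illustration only: it ignores the interleaving of the components (and hence the scheduler), it spells the sum modulo 2 with conditionals as suggested above, and the names `crypt_output` and `output_distribution` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def crypt_output(pay, coin1, coin2):
    """pay XOR coin1 XOR coin2, written with if-then-else only."""
    if coin1 == coin2:
        differ = 0
    else:
        differ = 1
    if pay == 1:
        return 1 - differ  # the payer announces the opposite
    else:
        return differ


def output_distribution(payer):
    """Distribution of (out_0, out_1, out_2) when cryptographer `payer`
    pays and the three coins are fair and independent."""
    dist = Counter()
    for coins in product([0, 1], repeat=3):
        outs = tuple(
            crypt_output(int(i == payer), coins[i], coins[(i + 1) % 3])
            for i in range(3)
        )
        dist[outs] += Fraction(1, 8)
    return dict(dist)
```

For every choice of payer the result is the same distribution: each of the four announcement triples with an odd number of 1s has probability 1/4. The observable announcements alone therefore carry no information about which cryptographer paid.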



4.4 Admissible Schedulers 

We now introduce the class of admissible schedulers. 

Standard (full-information) schedulers have access to all the informa- 
tion about the system and its components, and in particular the secret 
choices. Hence, such schedulers can leak secrets by making their decisions 
depend on the secret choice of the system. This is the case with the Dining 
Cryptographers protocol of Section 4.2.4: among all possible schedulers for 
the protocol, there are several that leak the identity of the payer. In fact 
the scheduler has the freedom to decide the order of the announcements 
of the cryptographers (interleaving), so a scheduler could choose to let the
payer announce last. In this way, the attacker learns the identity of the
payer simply by looking at the interleaving of the announcements.

[Figure 4.3 panels: Master, Coin_i, Crypt_i]




Figure 4.3: Dining Cryptographers Automata 
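
The leak caused by a full-information scheduler can be illustrated with a deliberately simplified sketch. It models only the order in which the three announcements are scheduled, not the full protocol; the function names and the attacker's guessing rule are assumptions made for illustration.

```python
def payer_last_order(payer, n=3):
    """Announcement order chosen by a full-information scheduler that can
    see the secret: every non-payer announces first, the payer last."""
    others = [i for i in range(n) if i != payer]
    return others + [payer]


def fixed_order(payer, n=3):
    """Announcement order chosen by a scheduler that ignores the secret."""
    return list(range(n))


def attacker_guess(order):
    """An attacker who observes the interleaving bets on the last announcer."""
    return order[-1]
```

Against `payer_last_order` the attacker's guess is always correct, whatever the coins are. Against `fixed_order` the guess is the same for every payer, so the interleaving reveals nothing about the secret.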



4.4.1 The screens intuition 

Let us first describe admissible schedulers informally. As mentioned in the
introduction, admissible schedulers can base their decisions only on partial
information about the evolution of the system. In particular, admissible
schedulers cannot base their decisions on information concerned with the
internal behavior of components (such as secret choices).

We follow the subsequent intuition: admissible schedulers are entities 
that have access to a screen with buttons, where each button represents 
one (current) available option. At each point of the execution the sched- 
uler decides the next step among the available options (by pressing the 
corresponding button). Then the output (if any) of the selected compo- 
nent becomes available to the scheduler and the screen is refreshed with 
the new available options (the ones corresponding to the system after
making the selected step). We impose that the scheduler can base its decisions
only on such information, namely: the screens and outputs it has seen up
to that point of the execution (and, of course, the decisions it has made).

Example 4.4.1. Consider $S = (\{c_1, c_2\})\; r \parallel q \parallel t$, where

$r = 0.5 : s_1.\bar{c}_1.\bar{c}_2 + 0.5 : s_2.\bar{c}_1.\bar{c}_2$,
$q = c_1.(0.5 : a_1 + 0.5 : b_1)$, $t = c_2.(0.5 : a_2 + 0.5 : b_2)$.

Figure 4.4 shows the sequence of screens corresponding to a particular
sequence of choices taken by the scheduler². Interleaving and communication
options are represented by yellow and red buttons, respectively. An
arrow between two screens represents the transition from one to the other
(produced by the scheduler pressing a button). Additionally, the decision
taken by the scheduler and the corresponding outputs are depicted above each
arrow.

Figure 4.4: Screens intuition 

Note that this system has exactly the same problem as the DC protocol:
a full-information scheduler could reveal the secret by basing the
interleaving order ($q$ first or $t$ first) on the secret choice of the component
$r$. However, the same does not hold anymore for admissible schedulers (the
scheduler cannot deduce the secret choice by just looking at the screens 
and outputs). This is also the case for the DC protocol, i.e., admissible 
schedulers cannot leak the secret of the protocol. 

²The transitions from screens 4 and 5 represent 2 steps each (for simplicity we omit
the $\tau$-steps generated by blind choices).

4.4.2 The formalization

Before formally defining admissible schedulers we need to formalize the
ingredients of the screens intuition. The buttons on the screen (available
options) are the enabled options given by the function enab (see (4.1) in
Section 4.3.1), the decision made by the scheduler is the tag of the selected
enabled option, and the observable actions are obtained by hiding the secret
actions from the scheduler by means of the following function:

$$\mathrm{sift}(a) = \begin{cases} a & \text{if } a \in \Sigma_O \cup \{\tau\}, \\ \tau & \text{if } a \in \Sigma_S. \end{cases}$$

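The partial information of a certain evolution of the system is given by the
map $t$ defined as follows.

Definition 4.4.1. Let $\hat{q} \xrightarrow{\ell_1, a_1} \cdots \xrightarrow{\ell_n, a_n} q_{n+1}$ be a finite path of the system,
then we define $t$ as:

$$t\bigl(\hat{q} \xrightarrow{\ell_1, a_1} \cdots \xrightarrow{\ell_n, a_n} q_{n+1}\bigr) = (\mathrm{enab}(\hat{q}), \ell_1, \mathrm{sift}(a_1)) \cdots (\mathrm{enab}(q_n), \ell_n, \mathrm{sift}(a_n)) \cdot \mathrm{enab}(q_{n+1}).$$

Finally, we have all the ingredients needed to define admissible
schedulers.

Definition 4.4.2 (Admissible schedulers). A scheduler $\zeta$ is admissible if
for all $\sigma, \sigma' \in \mathrm{Paths}^*$

$$t(\sigma) = t(\sigma') \text{ implies } \zeta(\sigma) = \zeta(\sigma').$$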

In this way, admissible schedulers are forced to take the same decisions
on paths that they cannot tell apart. Note that this is a restriction on the
original definition of (full-information) schedulers, where $t$ is the identity
map over finite paths (and consequently the scheduler is free to choose
differently).
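
The admissibility condition can be checked mechanically on a finite set of paths. The sketch below is an illustration under stated assumptions: a path is encoded as a pair `(steps, final_enabled)`, where each step is a triple `(enabled_tags, chosen_tag, action)`; the two example paths are hand-written prefixes of executions of the system of Example 4.4.1 (one per secret), with the synchronization tags written as the strings `"rq"` and `"rt"`. The names `leaky` and `blind` are hypothetical.

```python
SECRETS = {"s1", "s2"}
TAU = "tau"


def sift(action):
    """Hide secret actions from the scheduler by turning them into tau."""
    return TAU if action in SECRETS else action


def t(path):
    """Partial information (screens, decisions, sifted outputs) of a path."""
    steps, final_enabled = path
    view = tuple((frozenset(en), tag, sift(a)) for en, tag, a in steps)
    return view + (frozenset(final_enabled),)


def is_admissible(scheduler, paths):
    """Check that the scheduler decides identically on all given paths
    that have the same partial information."""
    decisions = {}
    for path in paths:
        choice = scheduler(path)
        if decisions.setdefault(t(path), choice) != choice:
            return False
    return True


def example_path(secret):
    """r makes its secret choice, then r and q synchronize on c1."""
    steps = [({"r"}, "r", secret), ({"rq"}, "rq", TAU)]
    return (steps, {"q", "rt"})


def leaky(path):
    """Full-information scheduler: schedules q first only if the secret is s1."""
    steps, final_enabled = path
    return "q" if steps[0][2] == "s1" else "rt"


def blind(path):
    """A scheduler that looks only at the current screen."""
    _, final_enabled = path
    return min(final_enabled)


paths = [example_path("s1"), example_path("s2")]
```

Here `t` gives the same view for both paths, so `is_admissible(leaky, paths)` is false while `is_admissible(blind, paths)` is true. The check is only as strong as the set of paths supplied; it does not enumerate the paths of a system by itself.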

In the kind of systems we consider (the TPAs) the only sources of
nondeterminism are the interleaving and interactions of the parallel components.
Consequently, in a TPA the notion of scheduler is quite simple: its role,
indeed, is to select, at each step, the component or pair of components which
will perform the next transition. In addition, the TPA model allows us to
express in a simple way the notion of admissibility: in fact the transitions
available in the last state of $\sigma$ are determined by the set of components
enabled in the last state of $\sigma$, and $t(\sigma)$ gives (among other information) such
set. Therefore $t(\sigma) = t(\sigma')$ implies that the last states of $\sigma$ and $\sigma'$ have the
same possible transitions, hence it is possible to require that $\zeta(\sigma) = \zeta(\sigma')$
without being too restrictive or too permissive. In more general systems,
where the sources of nondeterminism can be arbitrary, it is difficult to im- 
pose that the scheduler "does not depend on the secret choices", because 
different secret choices in general may give rise to states with different sets 
of transitions, and it is unclear whether such difference should be ruled 
out as "inadmissible", or should be considered as part of what a "real" 
scheduler can detect. 

4.5 Information-hiding properties in presence of 
nondeterminism 

In this section we revise the standard definition of information flow and 
anonymity in our framework of controlled nondeterminism. 

We first consider the notion of adversary. We consider three possible 
notions of adversaries, increasingly more powerful. 

4.5.1 Adversaries 

External adversaries: Clearly, an adversary should be able, by definition,
to see at least the observable actions. For an adversary external to the
system $S$, it is natural to assume that these are also the only actions that
he is supposed to see. Therefore, we define the observation domain, for an
external adversary, as the set of the (finite) sequences of observable actions,
namely:

$$\mathcal{O}_e = \Sigma_O^*.$$

Correspondingly, we need a function $t_e\colon \mathrm{Paths}^*(S) \to \mathcal{O}_e$ that extracts the
observables from the executions:

$$t_e\bigl(q_0 \xrightarrow{\ell_1, a_1} \cdots \xrightarrow{\ell_n, a_n} q_{n+1}\bigr) = \mathrm{sieve}(a_1) \cdots \mathrm{sieve}(a_n)$$

where

$$\mathrm{sieve}(a) = \begin{cases} a & \text{if } a \in \Sigma_O, \\ \epsilon & \text{if } a \in \Sigma_S \cup \{\tau\}. \end{cases}$$

Internal adversaries: An internal adversary may be able to see, besides
the observables, also the interleaving and synchronizations of the various
components, i.e. which component(s) are active, at each step of the
execution. Hence it is natural to define the observation domain, for an internal
adversary, as the sequence of pairs of observable action and tag (i.e. the
identifier(s) of the active component(s)), namely:

$$\mathcal{O}_i = (L \times (\Sigma_O \cup \{\tau\}))^*.$$

Correspondingly, we need a function $t_i\colon \mathrm{Paths}^*(S) \to \mathcal{O}_i$ that extracts the
observables from the executions:

$$t_i\bigl(q_0 \xrightarrow{\ell_1, a_1} \cdots \xrightarrow{\ell_n, a_n} q_{n+1}\bigr) = (\ell_1, \mathrm{sieve}(a_1)) \cdots (\ell_n, \mathrm{sieve}(a_n)).$$

Note that in this definition we could have equivalently used sift instead
of sieve.

Adversaries in collusion with the scheduler: Finally, we consider the case
in which the adversary is in collusion with the scheduler, or possibly the
adversary is the scheduler. To illustrate the difference between this kind of
adversaries and internal adversaries, consider the scheduler of an operating
system. In such a scenario an internal adversary is able to see which process
has been scheduled to run next (process in the "running state") whereas an
adversary in collusion with the scheduler can see as much as the scheduler,
thus being able to see (in addition) which processes are in the "ready state"
and which processes are in the "waiting / blocked" state. We will show
later that such additional information does not help the adversary to leak
information (see Proposition 4.5.9). The observation domain of adversaries
in collusion with the scheduler coincides with the one of the scheduler:

$$\mathcal{O}_s = (\mathcal{P}(L) \times L \times (\Sigma_O \cup \{\tau\}))^*.$$

The corresponding function

$$t_s\colon \mathrm{Paths}^*(S) \to \mathcal{O}_s$$

is defined as the one of the scheduler, i.e. $t_s = t$.
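
The three observation functions can be compared side by side on a concrete execution. The sketch below is illustrative only: a path is encoded as a list of steps `(enabled_tags, chosen_tag, action)`, the empty word is represented by the empty string, and `make_path` builds a hand-written execution of the system of Example 4.4.1 for a given secret. All names are assumptions.

```python
SECRETS = {"s1", "s2"}
TAU = "tau"
EPSILON = ""  # stands for the empty word


def sieve(action):
    """Keep observable actions, erase secret and silent ones."""
    return EPSILON if action in SECRETS or action == TAU else action


def sift(action):
    """Replace secret actions by tau, keep everything else."""
    return TAU if action in SECRETS else action


def t_external(steps):
    """External adversary: the sequence of observable actions only."""
    return tuple(sieve(a) for _, _, a in steps if sieve(a) != EPSILON)


def t_internal(steps):
    """Internal adversary: the active tag together with each observable."""
    return tuple((tag, sieve(a)) for _, tag, a in steps)


def t_scheduler(steps, final_enabled):
    """Adversary in collusion with the scheduler: same view as the scheduler."""
    view = tuple((frozenset(en), tag, sift(a)) for en, tag, a in steps)
    return view + (frozenset(final_enabled),)


def make_path(secret):
    """r chooses the secret, r and q synchronize, q makes its blind choice
    and then outputs a1 (hypothetical run of Example 4.4.1)."""
    return [
        ({"r"}, "r", secret),
        ({"rq"}, "rq", TAU),
        ({"q", "rt"}, "q", TAU),
        ({"q", "rt"}, "q", "a1"),
    ]
```

For this particular run, all three views are identical for `make_path("s1")` and `make_path("s2")`: the external view is `("a1",)` in both cases, and the internal and scheduler views differ from it only by recording tags and enabled sets.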

4.5.2 Information leakage 

In Information Flow and Anonymity there is a converging consensus for 
formalizing the notion of leakage as the difference or the ratio between the 
a priori uncertainty that the adversary has about the secret, and the a pos- 
teriori uncertainty, that is, the residual uncertainty of the adversary once 
it has seen the outcome of the computation. The uncertainty can be mea- 
sured in different ways. One popular approach is the information-theoretic 
one, according to which the system is seen as a noisy channel between the 
secret inputs and the observable output, and uncertainty corresponds to the 
Shannon entropy of the system (see preliminaries - Section 4.2). In this 
approach, the leakage is represented by the so-called mutual information, 
which expresses the correlation between the input and the output. 

The above approach, however, has recently been criticized by Smith
[Smi09], who has argued that Shannon entropy is not suitable to represent
the security threats in the typical case in which the adversary is interested
in figuring out the secret in a one-try attempt, and he has proposed to use
Rényi's min-entropy instead, or equivalently, the average probability of
succeeding. This leads to interpreting the uncertainty in terms of the notion
of vulnerability defined in the preliminaries (Section 4.2). The corresponding
notion of leakage, in the pure probabilistic case, has been investigated in
[Smi09] (multiplicative case) and in [BCP09] (additive case).

Here we adopt the vulnerability-based approach to define the notion of
leakage in our probabilistic and nondeterministic context. The Shannon- 
entropy-based approach could be extended to our context as well, because 
in both cases we only need to specify how to determine the conditional 
probabilities which constitute the channel matrix, and the marginal prob- 
abilities that constitute the input and the output distribution. 
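As a concrete reading of the vulnerability-based approach, the following Python sketch computes both leakage measures for a small channel. It assumes the standard definitions of [Smi09] and [BCP09], which Section 4.2 is taken to follow: the a priori vulnerability is max_s P(s), the a posteriori vulnerability is the sum over o of max_s P(s) P(o | s), the multiplicative leakage is their ratio and the additive leakage is their difference. The function names and the two example channels are illustrative assumptions, not part of the framework.

```python
def prior_vulnerability(prior):
    """Probability of guessing the secret in one try before observing anything."""
    return max(prior.values())

def posterior_vulnerability(channel, prior):
    """Expected probability of guessing the secret in one try after the observation.

    channel[s][o] is the conditional probability P(o | s).
    """
    observables = {o for row in channel.values() for o in row}
    return sum(
        max(prior[s] * channel[s].get(o, 0.0) for s in prior)
        for o in observables
    )

def multiplicative_leakage(channel, prior):
    return posterior_vulnerability(channel, prior) / prior_vulnerability(prior)

def additive_leakage(channel, prior):
    return posterior_vulnerability(channel, prior) - prior_vulnerability(prior)

# Hypothetical two-secret channel: the observable is correlated with the secret.
prior = {"s1": 0.5, "s2": 0.5}
noisy = {"s1": {"a": 0.75, "b": 0.25}, "s2": {"a": 0.25, "b": 0.75}}
# A channel with identical rows reveals nothing about the secret.
blind = {"s1": {"a": 0.5, "b": 0.5}, "s2": {"a": 0.5, "b": 0.5}}
```

For `noisy` the multiplicative leakage is 1.5 and the additive leakage is 0.25; for `blind` they take their minimum values, 1 and 0 respectively.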

We will denote by $S$ the random variable associated to the set of secrets 
$\mathcal{S} = \Sigma_S$, and by $O_x$ the random variables associated to the 
sets of observables $\mathcal{O}_x$, where $x \in \{e, i, s\}$. So, 
$\mathcal{O}_x$ represents the observation domains for the various kinds of 
adversaries defined above. 

As mentioned before, our results require some structural properties for 
the system: we assume that there is a single component in the system con- 
taining a secret choice and this component contains a single secret choice. 
This hypothesis is general enough to allow expressing protocols like the 
Dining Cryptographers, Crowds, voting protocols, etc., where the secret is 
chosen only once. 

Assumption 4.5.1. A system contains exactly one component with a syn- 
tactic occurrence of a secret choice, and such a choice does not occur in the 
scope of a recursive call. 

Note that the assumption implies that the choice appears exactly once 
in the operational semantics of the component. It would be possible to re- 
lax the assumption and allow more than one secret choice in a component, 
as long as there are no observable actions between the secret choices. For 
the sake of simplicity in this paper we impose the more restrictive require- 
ment. As a consequence, we have that the operational semantics of systems 
satisfies the following property: 

Proposition 4.5.2. If $q \xrightarrow{\ell} \mu$ and $q' \xrightarrow{\ell'} \mu'$ are both secret choices, then 
$\ell = \ell'$ and there exist $p_i$'s, $q_i$'s and $q'_i$'s such that: 

\[ \mu = \sum_i p_i \cdot \delta(s_i, q_i) \quad \text{and} \quad \mu' = \sum_i p_i \cdot \delta(s_i, q'_i), \]

i.e., $\mu$ and $\mu'$ differ only for the continuation states. 

Proof. Because of Assumption 4.5.1, there is only one component that can 
generate a secret choice, and it generates only one such choice. Due to 
the different possible interleavings, this choice can appear as an outgoing 
transition in more than one state of the TPA, but the probabilities are 
always the same, because the interleaving rule does not change them. □ 

Given a system, each scheduler $\zeta$ determines a fully probabilistic 
automaton and, as a consequence, the probabilities 

\[ P_\zeta(s, o) = P_\zeta\Big( \bigcup \{ \langle\sigma\rangle \mid \sigma \in \mathrm{Paths}^*(S),\ t_x(\sigma) = o,\ \mathrm{secr}(\sigma) = s \} \Big) \]

for each secret $s \in \mathcal{S}$ and observable $o \in \mathcal{O}_x$, where $x \in \{e, i, s\}$. Here secr 
is the map from paths to their secret action. From these we can derive, in 
standard ways, the marginal probabilities $P_\zeta(s)$, $P_\zeta(o)$, and the conditional 
probabilities $P_\zeta(o \mid s)$. 

Every scheduler leads to a (generally different) noisy channel, whose 
matrix is determined by the conditional probabilities as follows: 

Definition 4.5.3. Let $x \in \{e, i, s\}$. Given a system and a scheduler $\zeta$, the 
corresponding channel matrix $C^x_\zeta$ has rows indexed by $s \in \mathcal{S}$ and columns 
indexed by $o \in \mathcal{O}_x$. The value in $(s, o)$ is given by 

\[ P_\zeta(o \mid s) = \frac{P_\zeta(s, o)}{P_\zeta(s)}. \]
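The construction of the channel matrix from the joint probabilities can be sketched in Python as follows. The joint distribution below is a made-up example standing for the probabilities induced by one fixed scheduler; the function name `channel_matrix` is likewise an assumption for illustration.

```python
from collections import defaultdict

def channel_matrix(joint):
    """Return (channel, prior) with channel[s][o] = P(s, o) / P(s).

    joint maps (secret, observable) pairs to the joint probabilities P(s, o).
    """
    prior = defaultdict(float)
    for (s, _o), p in joint.items():
        prior[s] += p  # marginal P(s)
    channel = defaultdict(dict)
    for (s, o), p in joint.items():
        if prior[s] > 0:  # rows are only defined for secrets of positive probability
            channel[s][o] = p / prior[s]
    return {s: dict(row) for s, row in channel.items()}, dict(prior)

# Hypothetical joint distribution induced by some fixed scheduler.
joint = {("s1", "a"): 0.3, ("s1", "b"): 0.1, ("s2", "a"): 0.15, ("s2", "b"): 0.45}
channel, prior = channel_matrix(joint)
```

Each row of the resulting matrix is a probability distribution over observables, which is what makes it a channel in the information-theoretic sense.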

Given a scheduler $\zeta$, the multiplicative leakage can be defined as $\mathcal{L}_\times(C^x_\zeta, P_\zeta)$, 
while the additive leakage can be defined as $\mathcal{L}_+(C^x_\zeta, P_\zeta)$, where $P_\zeta$ is the a 
priori distribution on the set of secrets (see preliminaries, Section 4.2). 
However, we want a notion of leakage independent from the scheduler, and 
therefore it is natural to consider the worst case over all possible admissible 
schedulers. 

Definition 4.5.4 (x-leakage). Let $x \in \{e, i, s\}$. Given a system, the 
multiplicative leakage is defined as 

\[ \mathcal{ML}^x_\times = \max_{\zeta \in \mathrm{Adm}} \mathcal{L}_\times(C^x_\zeta, P_\zeta), \]

while the additive leakage is defined as 

\[ \mathcal{ML}^x_+ = \max_{\zeta \in \mathrm{Adm}} \mathcal{L}_+(C^x_\zeta, P_\zeta), \]

where Adm is the class of admissible schedulers defined in the previous 
section. 
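The worst-case definition can be illustrated on a finite, explicitly listed family of schedulers. This is only a sketch: the class Adm is in general not finite, the scheduler names are hypothetical, and the leakage function assumes the standard ratio of a posteriori to a priori vulnerability.

```python
def multiplicative_leakage(channel, prior):
    """Ratio of a posteriori to a priori vulnerability (standard definition, assumed)."""
    observables = {o for row in channel.values() for o in row}
    posterior = sum(
        max(prior[s] * channel[s].get(o, 0.0) for s in prior) for o in observables
    )
    return posterior / max(prior.values())

def worst_case_leakage(family):
    """Maximum leakage over a finite family {scheduler: (channel, prior)}."""
    return max(multiplicative_leakage(c, p) for c, p in family.values())

# Two hypothetical schedulers inducing different channels on the same secrets.
uniform = {"s1": 0.5, "s2": 0.5}
family = {
    "zeta1": ({"s1": {"a": 0.5, "b": 0.5}, "s2": {"a": 0.5, "b": 0.5}}, uniform),
    "zeta2": ({"s1": {"a": 0.75, "b": 0.25}, "s2": {"a": 0.25, "b": 0.75}}, uniform),
}
worst = worst_case_leakage(family)
```

Here the second scheduler determines the result: the system is judged by its most revealing admissible scheduler, not by a typical one.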

The classes of observables e, i, and s determine an increasing 
degree of leakage: 

Proposition 4.5.5. Given a system, for the multiplicative leakage we have 

1. For every scheduler $\zeta$, $\mathcal{L}_\times(C^e_\zeta, P_\zeta) \le \mathcal{L}_\times(C^i_\zeta, P_\zeta) \le \mathcal{L}_\times(C^s_\zeta, P_\zeta)$. 

2. $\mathcal{ML}^e_\times \le \mathcal{ML}^i_\times \le \mathcal{ML}^s_\times$. 

Similarly for the additive leakage. 

Proof. 

1. The property follows immediately from the fact that the domain $\mathcal{O}_e$ 
is an abstraction of $\mathcal{O}_i$, and $\mathcal{O}_i$ is an abstraction of $\mathcal{O}_s$. 

2. Immediate from the previous point and from the definitions of $\mathcal{ML}^x_\times$ and 
$\mathcal{ML}^x_+$. □ 
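The abstraction argument in the proof can be made concrete: merging observables that a weaker adversary cannot tell apart replaces a sum of maxima by a maximum of sums, which can only lower the a posteriori vulnerability. The Python sketch below uses a hypothetical channel whose i-observables are pairs (component label, action), of which the e-adversary sees only the action. It assumes the standard a posteriori vulnerability, and the helper names are illustrative.

```python
def posterior_vulnerability(channel, prior):
    """Sum over observables of the largest joint probability P(s) * P(o | s)."""
    observables = {o for row in channel.values() for o in row}
    return sum(
        max(prior[s] * channel[s].get(o, 0.0) for s in prior) for o in observables
    )

def abstract(channel, projection):
    """Merge the columns of all observables with the same image under `projection`."""
    coarse = {}
    for s, row in channel.items():
        coarse[s] = {}
        for o, p in row.items():
            key = projection(o)
            coarse[s][key] = coarse[s].get(key, 0.0) + p
    return coarse

prior = {"s1": 0.5, "s2": 0.5}
# Hypothetical internal view: which component performed the action depends on the secret.
internal = {"s1": {("Q", "o"): 1.0}, "s2": {("T", "o"): 1.0}}
# External view: only the action itself is visible.
external = abstract(internal, lambda obs: obs[1])
```

In this example the internal adversary guesses the secret with probability 1, while the external adversary is left with the a priori probability 0.5.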

4.5.3 Strong anonymity (revised) 

We now consider the situation in which the leakage is the minimum for all 
possible admissible schedulers. In the purely probabilistic case, we know that 
the minimum possible multiplicative leakage is 1, and the minimum possible 
additive one is 0. We also know that this is the case for all possible input 
distributions if and only if the capacity of the channel matrix is 0, which 
corresponds to the case in which the rows of the matrix are all the same. 
This corresponds to the notion of strong probabilistic anonymity defined 
in [BP05]. In the framework of information flow, it would correspond to 
probabilistic non-interference. Still in [BP05], the authors also considered 
the extension of this notion in the presence of nondeterminism, and required 
the condition to hold under all possible schedulers. This is too strong in 
practice, as we have argued in the introduction: in most cases we can build 
a scheduler that leaks the secret by changing the interleaving order. We 
therefore tune this notion by requiring the condition to hold only under the 
admissible schedulers. 

Definition 4.5.6 (x-strongly anonymous). Let $x \in \{e, i, s\}$. We say that a 
system is x-strongly-anonymous if for all admissible schedulers $\zeta$ we have 

\[ P_\zeta(o \mid s_1) = P_\zeta(o \mid s_2) \]

for all $s_1, s_2 \in \Sigma_S$, and $o \in \mathcal{O}_x$. 
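For a single scheduler, the condition of the definition amounts to checking that all rows of the channel matrix coincide. The following Python sketch performs that check on one matrix; it does not quantify over all admissible schedulers, which is the hard part of the definition. The function name and the numerical tolerance are assumptions.

```python
def rows_coincide(channel, tol=1e-12):
    """True if P(o | s1) = P(o | s2) for all secrets s1, s2 and observables o.

    channel[s][o] is the conditional probability P(o | s) under one fixed scheduler.
    """
    rows = list(channel.values())
    observables = {o for row in rows for o in row}
    first = rows[0]
    return all(
        abs(row.get(o, 0.0) - first.get(o, 0.0)) <= tol
        for row in rows[1:]
        for o in observables
    )

# Hypothetical channels for two secrets.
blind = {"s1": {"a": 0.5, "b": 0.5}, "s2": {"a": 0.5, "b": 0.5}}
noisy = {"s1": {"a": 0.75, "b": 0.25}, "s2": {"a": 0.25, "b": 0.75}}
```

A system is x-strongly-anonymous only if this check succeeds for the matrix of every admissible scheduler; a single failing scheduler is enough to refute it.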

The following corollary is an immediate consequence of Proposition 4.5.5. 

Corollary 4.5.7. 

1. If a system is s-strongly-anonymous, then it is also i-strongly-anonymous. 

2. If a system is i-strongly-anonymous, then it is also e-strongly-anonymous. 

The converse of point (2), in the previous corollary, does not hold, as 
shown by the following example: 

Example 4.5.8. Consider the system $S = (\{c_1, c_2\})\ P \parallel Q \parallel T$ where 

\[ P = (0.5 : s_1 . c_1) + (0.5 : s_2 . c_2), \qquad Q = c_1 . o, \qquad T = c_2 . o. \]

It is easy to check that $S$ is e-strongly-anonymous but not 
i-strongly-anonymous, showing that (as expected) internal adversaries can 
"distinguish more" than external adversaries. 

On the contrary, for point (1) of Corollary 4.5.7, also the other direction 
holds: 

Proposition 4.5.9. A system is s-strongly-anonymous if and only if it is 
i-strongly-anonymous. 

Proof. Corollary 4.5.7 ensures the only-if part. For the if part, we proceed 
by contradiction. Assume that the system is i-strongly-anonymous but 
that $P_\zeta(o \mid s_1) \neq P_\zeta(o \mid s_2)$ for some admissible scheduler $\zeta$ and observable 
$o \in \mathcal{O}_s$. Let $o = (enab(\hat{q}), \ell_1, sift(\alpha_1)) \cdots (enab(q_n), \ell_n, sift(\alpha_n))$ and let $o'$ 
be the projection of $o$ on $\mathcal{O}_i$, i.e. $o' = (\ell_1, sift(\alpha_1)) \cdots (\ell_n, sift(\alpha_n))$. Since 
the system is i-strongly-anonymous, $P_\zeta(o' \mid s_1) = P_\zeta(o' \mid s_2)$, which means 
that the difference in probability with respect to $o$ must be due to at least 
one of the sets of enabled processes. Let us consider the first set $L$ in $o$ 
which exhibits a difference in the probabilities, and let $o''$ be the prefix of 
$o$ up to the tuple containing $L$. Since the probabilities are determined by 
the distributions on the probabilistic choices which occur in the individual 
components, the probability of each $\ell \in L$ to be available (given the trace 
$o''$) is independent of the other labels in $L$. At least one such $\ell$ must 
therefore have a different probability, given the trace $o''$, depending on 
whether the secret choice was $s_1$ or $s_2$. And, because of the assumption on 
$L$, we can replace the conditioning on trace $o''$ with the conditioning on the 
projection $o'''$ of $o''$ on $\mathcal{O}_i$. Consider now an admissible scheduler $\zeta'$ that 
acts like $\zeta$ up to $o''$, and then selects $\ell$ if and only if it is available. Since 
the probability that $\ell$ is not available depends on the choice of $s_1$ or $s_2$, we 
have $P_{\zeta'}(o''' \mid s_1) \neq P_{\zeta'}(o''' \mid s_2)$, which contradicts the hypothesis that the 
system is i-strongly-anonymous. □ 

Intuitively, this result means that an s-adversary can leak information 
if and only if an i-adversary can leak information or, in other words, s- 
adversaries are as powerful as i-adversaries (even when the former can ob- 
serve more information). 

4.6 Verifying strong anonymity: a proof technique 
based on automorphisms 

As mentioned in the introduction, several problems involving restricted 
schedulers have been shown undecidable (including computing maximum 
/ minimum probabilities for the case of standard model checking [GD07, 
Gir09]). These results are discouraging for the aim of finding algorithms for 
verifying strong anonymity/non-interference using our notion of admissible 
schedulers (and most definitions based on restricted schedulers). Although 
the problem seems to be undecidable in general, in this section 
we present a sufficient (but not necessary) anonymity proof technique: 
we show that the existence of automorphisms between each pair of secrets 
implies strong anonymity. We conclude this section by illustrating the 
applicability of our proof technique by means of the DC protocol, i.e., we prove 
that the protocol does not leak information by constructing automorphisms 
between pairs of cryptographers. It is worth mentioning that our proof 
technique is general enough to be used for the analysis of information leakage 
of a broad family of protocols, namely any protocol that can be modeled 
in our framework. 

4.6.1 The proof technique 

In practice, proving anonymity often proceeds in the following way. Given a 
trace in which user A is the 'culprit', we construct an observationally 
equivalent trace in which user B is the 'culprit' [HO05, GHvRP05, MVdV04, 
HK07c]. This new trace is typically obtained by 'switching' the behavior of 
users A and B. We formalize this idea by using the notion of automorphism, 
cf. e.g. [Rut00]. 

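Definition 4.6.1 (Automorphism). Given a TPA $(Q, L, \Sigma, \hat{q}, \theta)$ we say that 
a bijection $f : Q \to Q$ is an automorphism if it satisfies $f(\hat{q}) = \hat{q}$ and 

\[ q \xrightarrow{\ell} \sum_i p_i \cdot \delta(\alpha_i, q_i) \iff f(q) \xrightarrow{\ell} \sum_i p_i \cdot \delta(\alpha_i, f(q_i)). \]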
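For a finite automaton, the two conditions of the definition can be checked mechanically. The Python sketch below uses an assumed encoding: `trans[q]` is a set of (label, distribution) pairs, and a distribution is a frozenset of ((action, next state), probability) items. The small automaton and the two candidate maps are hypothetical.

```python
def rename_targets(f, dist):
    """Apply f to the target states of a distribution over (action, state) pairs."""
    return frozenset(((a, f[q]), p) for (a, q), p in dist)

def is_automorphism(f, states, init, trans):
    """Check that f is a bijection fixing `init` and preserving all transitions."""
    if set(f) != set(states) or set(f.values()) != set(states):
        return False  # not a bijection on the state space
    if f[init] != init:
        return False
    for q in states:
        image = {(l, rename_targets(f, d)) for l, d in trans.get(q, set())}
        # Since f is a bijection, equality of the two transition sets for every q
        # gives both directions of the equivalence in the definition.
        if image != trans.get(f[q], set()):
            return False
    return True

states = ["q0", "q1", "q2", "q3", "q4"]
trans = {
    "q0": {("l", frozenset({(("tau", "q1"), 0.5), (("tau", "q2"), 0.5)}))},
    "q1": {("l", frozenset({(("a", "q3"), 1.0)}))},
    "q2": {("l", frozenset({(("a", "q4"), 1.0)}))},
}
swap = {"q0": "q0", "q1": "q2", "q2": "q1", "q3": "q4", "q4": "q3"}
bad = {"q0": "q0", "q1": "q3", "q2": "q2", "q3": "q1", "q4": "q4"}
```

The map `swap` exchanges the two symmetric branches and is an automorphism; `bad` is a bijection fixing the initial state but does not preserve the transition out of `q0`.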

In order to prove anonymity it is sufficient to prove that the behaviors 
of any two 'culprits' can be exchanged without the adversary noticing. We 
will express this by means of the existence of automorphisms that exchange 
a given pair of secrets $s_i$ and $s_j$. 

Before presenting the main theorem of this section we need to introduce 
one last definition. Let $S = (C)\ q_1 \parallel \cdots \parallel q_n$ be a system and $M$ its 
corresponding TPA. We define $M_\tau$ as the automaton obtained after "hiding" all 
the secret actions of $M$. The idea is to replace every occurrence of a secret 
$s$ in $M$ by the silent action $\tau$. Note that this can be formalized by replacing 
the secret choice by a blind choice in the corresponding component $q_i$ of 
the system $S$. 

We now formalize the relation between automorphisms and strong ano- 
nymity. We will first show that the existence of automorphisms exchanging 
pairs of secrets implies s-strong anonymity (Theorem 4.6.2). Then, we will 
show that the converse does not hold, i.e. s-strongly-anonymous systems 
are not necessarily automorphic (Example 4.6.3). 

Theorem 4.6.2. Let $S$ be a system satisfying Assumption 4.5.1 and $M$ its 
tagged probabilistic automaton. If for every pair of secrets $s_i, s_j \in \Sigma_S$ there 
exists an automorphism $f$ of $M_\tau$ such that for any state $q$ we have 

\[ q \xrightarrow{\ell, s_i} q' \implies f(q) \xrightarrow{\ell, s_j} f(q'), \tag{4.2} \]

then $S$ is s-strongly-anonymous. 

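Proof. Assume that for every pair of secrets $s_i, s_j$ we have an automorphism 
$f$ satisfying the hypothesis of the theorem. We have to show that, for every 
admissible scheduler $\zeta$, we have: 

\[ \forall o \in \mathcal{O}_s : P_\zeta(o \mid s_i) = P_\zeta(o \mid s_j). \]

We start by observing that for $s_i$, by Proposition 4.5.2, there exists a 
unique $p_i$ such that, for all transitions $q \xrightarrow{\ell} \mu$, if $\mu$ is a (probabilistic) secret 
choice, then $\mu(s_i, -) = p_i$. Similarly for $s_j$, there exists a unique $p_j$ such 
that $\mu(s_j, -) = p_j$ for all secret choices $\mu$. 

Let us now recall the definition of $P_\zeta(o \mid s)$: 

\[ P_\zeta(o \mid s) = \frac{P_\zeta(o \wedge s)}{P_\zeta(s)}, \]

where $P_\zeta(o \wedge s) = P_\zeta(\{\pi \in \mathrm{CPaths} \mid t_s(\pi) = o \wedge \mathrm{secr}(\pi) = s\})$, with $\mathrm{secr}(\pi)$ 
being the (either empty or singleton) sequence of secret actions of $\pi$, and 
$P_\zeta(s) = P_\zeta(\{\pi \in \mathrm{CPaths} \mid \mathrm{secr}(\pi) = s\})$. 

Note that, since a secret appears at most once on a complete path, we have: 

\[ P_\zeta(s_i) = P_\zeta\big(\{\pi \xrightarrow{\ell, s_i} \sigma \in \mathrm{CPaths} \mid \pi, \sigma\}\big) = \sum_{\substack{\mathrm{last}(\pi) \xrightarrow{\ell} \mu \\ \mu \text{ secret choice}}} P_\zeta(\langle\pi\rangle) \cdot p_i \]

and analogously 

\[ P_\zeta(s_j) = P_\zeta\big(\{\pi \xrightarrow{\ell, s_j} \sigma \in \mathrm{CPaths} \mid \pi, \sigma\}\big) = \sum_{\substack{\mathrm{last}(\pi) \xrightarrow{\ell} \mu \\ \mu \text{ secret choice}}} P_\zeta(\langle\pi\rangle) \cdot p_j. \]

Let us now consider $P_\zeta(o \mid s_i)$ and $P_\zeta(o \mid s_j)$. We have: 

\[
\begin{aligned}
P_\zeta(o \wedge s_i) &= P_\zeta\big(\{\pi \xrightarrow{\ell, s_i} \sigma \in \mathrm{CPaths} \mid t_s(\pi \xrightarrow{\ell, s_i} \sigma) = o\}\big) \\
&= \sum_{\substack{\mathrm{last}(\pi) \xrightarrow{\ell} \mu \\ \mu \text{ secret choice}}} P_\zeta(\langle\pi\rangle) \cdot p_i \cdot \sum_{\substack{\pi \xrightarrow{\ell, s_i} \sigma \in \mathrm{Paths}^* \\ t_s(\pi \xrightarrow{\ell, s_i} \sigma) = o \,\wedge\, \mathrm{last}(t_e(\sigma)) \neq \tau}} P_\zeta(\langle\sigma\rangle),
\end{aligned}
\]

again using that a secret appears at most once on a complete path. Moreover, 
note that we have overloaded the notation $P_\zeta$ by using it for different 
measures when writing $P_\zeta(\langle\sigma\rangle)$, since $\sigma$ need not start in the initial state $\hat{q}$. 
Analogously we have: 

\[
\begin{aligned}
P_\zeta(o \wedge s_j) &= P_\zeta\big(\{\pi \xrightarrow{\ell, s_j} \sigma \in \mathrm{CPaths} \mid t_s(\pi \xrightarrow{\ell, s_j} \sigma) = o\}\big) \\
&= \sum_{\substack{\mathrm{last}(\pi) \xrightarrow{\ell} \mu \\ \mu \text{ secret choice}}} P_\zeta(\langle\pi\rangle) \cdot p_j \cdot \sum_{\substack{\pi \xrightarrow{\ell, s_j} \sigma \in \mathrm{Paths}^* \\ t_s(\pi \xrightarrow{\ell, s_j} \sigma) = o \,\wedge\, \mathrm{last}(t_e(\sigma)) \neq \tau}} P_\zeta(\langle\sigma\rangle).
\end{aligned}
\]

Therefore, since the factors $p_i$ and $p_j$ cancel, we derive 

\[ P_\zeta(o \mid s_i) = \frac{\displaystyle \sum_{\substack{\mathrm{last}(\pi) \xrightarrow{\ell} \mu \\ \mu \text{ secret choice}}} P_\zeta(\langle\pi\rangle) \cdot \sum_{\substack{\pi \xrightarrow{\ell, s_i} \sigma \in \mathrm{Paths}^* \\ t_s(\pi \xrightarrow{\ell, s_i} \sigma) = o \,\wedge\, \mathrm{last}(t_e(\sigma)) \neq \tau}} P_\zeta(\langle\sigma\rangle)}{\displaystyle \sum_{\substack{\mathrm{last}(\pi) \xrightarrow{\ell} \mu \\ \mu \text{ secret choice}}} P_\zeta(\langle\pi\rangle)} \tag{4.3} \]

\[ P_\zeta(o \mid s_j) = \frac{\displaystyle \sum_{\substack{\mathrm{last}(\pi) \xrightarrow{\ell} \mu \\ \mu \text{ secret choice}}} P_\zeta(\langle\pi\rangle) \cdot \sum_{\substack{\pi \xrightarrow{\ell, s_j} \sigma \in \mathrm{Paths}^* \\ t_s(\pi \xrightarrow{\ell, s_j} \sigma) = o \,\wedge\, \mathrm{last}(t_e(\sigma)) \neq \tau}} P_\zeta(\langle\sigma\rangle)}{\displaystyle \sum_{\substack{\mathrm{last}(\pi) \xrightarrow{\ell} \mu \\ \mu \text{ secret choice}}} P_\zeta(\langle\pi\rangle)} \tag{4.4} \]

Observe that the denominators of both formulae (4.3) and (4.4) are the 
same. Also note that, since $f$ is an automorphism, for every path $\pi$, $f(\pi)$ 
obtained by replacing each state in $\pi$ with its image under $f$ is also a 
path. Moreover, since $f$ satisfies (4.2), for every path $\pi \xrightarrow{\ell, s_i} \sigma$ we have that 
$f(\pi) \xrightarrow{\ell, s_j} f(\sigma)$ is also a path. Furthermore $f$ induces a bijection between 
the sets 

\[ \{(\pi, \sigma) \mid \mathrm{last}(\pi) \xrightarrow{\ell} \mu \text{ s.t. } \mu \text{ secret choice},\ \pi \xrightarrow{\ell, s_i} \sigma \in \mathrm{Paths}^*,\ t_s(\pi \xrightarrow{\ell, s_i} \sigma) = o,\ \mathrm{last}(t_e(\sigma)) \neq \tau\} \]

and 

\[ \{(\pi, \sigma) \mid \mathrm{last}(\pi) \xrightarrow{\ell} \mu \text{ s.t. } \mu \text{ secret choice},\ \pi \xrightarrow{\ell, s_j} \sigma \in \mathrm{Paths}^*,\ t_s(\pi \xrightarrow{\ell, s_j} \sigma) = o,\ \mathrm{last}(t_e(\sigma)) \neq \tau\} \]

given by $(\pi, \sigma) \leftrightarrow (f(\pi), f(\sigma))$. 

Finally, since $\zeta$ is admissible, $t_s(\pi) = t_s(f(\pi))$, and $f$ is an 
automorphism, it is easy to prove by induction that $P_\zeta(\langle\pi\rangle) = P_\zeta(\langle f(\pi)\rangle)$. Similarly, 
$P_\zeta(\langle\sigma\rangle) = P_\zeta(\langle f(\sigma)\rangle)$. Hence the numerators of (4.3) and (4.4) coincide, which 
concludes the proof. □ 

Note that, since s-strong anonymity implies i-strong anonymity and 
e-strong anonymity, the existence of such an automorphism implies all the 
notions of strong anonymity presented in this work. We now proceed to 
show that the converse does not hold, i.e. strongly anonymous systems are 
not necessarily automorphic. 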

Example 4.6.3. Consider the following (single component) system 

\[
\begin{array}{l}
0.5 : s_1 . \big(0.5 : (p : a + (1-p) : b) + 0.5 : ((1-p) : a + p : b)\big) \\
\quad + \\
0.5 : s_2 . \big(0.5 : (q : a + (1-q) : b) + 0.5 : ((1-q) : a + q : b)\big)
\end{array}
\]

It is easy to see that such a system is s-strongly-anonymous; however, if $p \neq q$ 
and $p \neq 1 - q$ there does not exist an automorphism for the pair of secrets 
$(s_1, s_2)$. 
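The anonymity claim of this example can be checked by a direct computation. The derivation below assumes that the two inner 0.5-choices are not visible to the adversary, so that traces differ only in the final action a or b; since the system has a single component, there is no scheduling choice to consider.

```latex
\begin{align*}
P(a \mid s_1) &= 0.5\,p + 0.5\,(1-p) = 0.5, & P(b \mid s_1) &= 0.5\,(1-p) + 0.5\,p = 0.5,\\
P(a \mid s_2) &= 0.5\,q + 0.5\,(1-q) = 0.5, & P(b \mid s_2) &= 0.5\,(1-q) + 0.5\,q = 0.5.
\end{align*}
```

Both rows of the channel matrix equal (0.5, 0.5) for every value of p and q, whereas an automorphism must preserve the probabilities of the individual branches, which is only possible when p = q or p = 1 - q.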

The following example demonstrates that our proof technique does not 
carry over to systems whose components admit internal parallelism. 

Example 4.6.4. Consider $S = (\{c_1, c_2\})\ r \parallel q \parallel t$, where 

\[ r = 0.5 : s_1 . c_1 + 0.5 : s_2 . c_2, \qquad q = c_1 . (a \mid b), \qquad t = c_2 . (a \mid b), \]

where $q_1 \mid q_2$ represents the parallel composition of $q_1$ and $q_2$. It is easy to 
show that there exists an automorphism for $s_1$ and $s_2$. However, admissible 
schedulers are able to leak such secrets. This is due to the fact that 
component $r$ synchronizes with $q$ and $t$ on different channels, thus a scheduler of 
$S$ is not restricted to select the same transitions on the branches associated 
to $s_1$ and $s_2$ (remember that schedulers can observe synchronization). 

We now show that the definition of x-strong-anonymity is independent 
of the particular distribution over secrets, i.e., if a system is 
x-strongly-anonymous for a particular distribution over secrets, then it is 
x-strongly-anonymous for all distributions over secrets. This result is useful 
because it allows us to prove systems to be strongly anonymous even when their 
distribution over secrets is not known. 

Theorem 4.6.5. Consider a system $S = (C)\ q_1 \parallel \cdots \parallel q_i \parallel \cdots \parallel q_n$. Let 
$q_i$ be the component which contains the secret choice, and assume that it is 
of the form $\sum_j p_j : s_j . q_j$. Consider now the system 
$S' = (C)\ q_1 \parallel \cdots \parallel q'_i \parallel \cdots \parallel q_n$, where $q'_i$ is identical to $q_i$ except for the secret choice, which 
is replaced by $\sum_j p'_j : s_j . q_j$. Then we have that: 

1. For every $s_i, s_j$ there is an automorphism on $S$ satisfying the 
assumption of Theorem 4.6.2 if and only if the same holds for $S'$. 

2. $S$ is x-strongly-anonymous if and only if $S'$ is x-strongly-anonymous. 

Note: 1) does not imply 2), because in principle neither $S$ nor $S'$ may have 
the automorphism, and still one of the two could be strongly anonymous. 

Proof. We note that the PAs generated by $S$ and $S'$ coincide except for the 
probability distribution on the secret choices. Since the definition of 
automorphism and the assumption of Theorem 4.6.2 do not depend on these 
probability distributions, (1) is immediate. As for (2), we observe that 
x-strong anonymity only depends on the conditional probabilities $P_\zeta(o \mid s)$. 
By looking at the proof of Theorem 4.6.2, we can see that in the 
computation of $P_\zeta(o \mid s)$ the probabilities on the secret choices (i.e. the $p_j$'s) 
are eliminated. Namely, $P_\zeta(o \mid s)$ does not depend on the $p_j$'s, which means 
that the value of the $p_j$'s has no influence on whether the system is 
x-strongly-anonymous or not. □ 

4.6.2 An Application: Dining Cryptographers 

Now we show how to apply the proof technique presented in this section to 
the Dining Cryptographers protocol. Concretely, we show that there exists 
an automorphism $f$ exchanging the behavior of $Crypt_0$ and $Crypt_1$; by 
symmetry, the same holds for the other two combinations. 

Consider the automorphisms of Master and $Coin_1$ indicated in Figure 
4.5. The states that are not explicitly mapped (by a dotted arrow) are 
mapped to themselves. 

Figure 4.5: Automorphism between $Crypt_0$ and $Crypt_1$ 

Also consider the identity automorphism on $Crypt_i$ (for i = 0, 1, 2) and 
on $Coin_i$ (for i = 0, 2). It is easy to check that the product of these seven 
automorphisms is an automorphism for $Crypt_0$ and $Crypt_1$. 

4.7 Related Work 

The problem of the full-information scheduler has already been extensively 
investigated in the literature. The works [CCK+06a] and [CCK+06b] consider 
probabilistic automata and introduce a restriction on the scheduler for the 
purpose of making them suitable to applications in security. Their 
approach is based on dividing the actions of each component of the system 
in equivalence classes (tasks). The order of execution of different tasks is 
decided in advance by a so-called task scheduler. The remaining nondeter- 
minism within a task is resolved by a second scheduler, which models the 
standard adversarial scheduler of the cryptographic community. This sec- 
ond entity has limited knowledge about the other components: it sees only 
the information that they communicate during execution. Their notion of 
task scheduler is similar to our notion of admissible scheduler, but more 
restricted since the strategy of the task scheduler is decided entirely before 
the execution of the system. 

Another work along these lines is [dAHJ01], which uses partitions on 
the state-space to obtain partial-information schedulers. However, that work 
considers a synchronous parallel composition, so the setting is rather 
different from ours. 

The works in [CP10, CNP09] are similar to ours in spirit, but in a sense 
dual from a technical point of view. Instead of defining a restriction on the 
class of schedulers, they provide a way to specify that a choice is transparent 
to the scheduler. They achieve this by introducing labels in process terms, 
used to represent both the states of the execution tree and the next action or 
step to be scheduled. They make two states indistinguishable to schedulers, 
and hence the choice between them private, by associating to them the same 
label. Furthermore, their "equivalence classes" (schedulable actions with 
the same label) can change dynamically, because the same action can be 
associated to different labels during the execution. 

In [AAPvR10] we have extended the framework presented in this work 
(by allowing internal nondeterminism and adding a second type of scheduler 
to resolve it) with the aim of investigating angelic vs demonic nondetermin- 
ism in equivalence-based properties. 

The fact that full-information schedulers are unrealistic has also been 
observed in fields other than security. With the aim of coping with 
general properties (not only those concerning security), first attempts used 
restricted schedulers in order to obtain rules for compositional 
reasoning [dAHJ01]. The justification for those restricted schedulers is the same 
as for ours, namely, that not all information is available to all entities in 
the system. Later on, it was shown that model checking is undecidable in 
its general form for the kind of restricted schedulers presented in [dAHJ01]; 
see [GD07] and, more recently, [Gir09]. 

Finally, to the best of our knowledge, this is the first work using auto- 
morphisms as a sound proof technique (in our case to prove strong anony- 
mity and non-interference). The closest line of work we are aware of is in 
the field of model checking. There, isomorphisms can be used to identify 
symmetries in the system, and such symmetries can then be exploited to 
alleviate the state space explosion (see for instance [KNP06]). 

Significant Diagnostic 
Counterexample Generation 



In this chapter, we present a novel technique for counterexam- 
ple generation in probabilistic model checking of Markov Chains 
and Markov Decision Processes. (Finite) paths in counterexam- 
ples are grouped together in witnesses that are likely to provide 
similar debugging information to the user. We list five prop- 
erties that witnesses should satisfy in order to be useful as de- 
bugging aid: similarity, accuracy, originality, significance, and 
finiteness. Our witnesses contain paths that behave similarly 
outside strongly connected components. Then, we show how to 
compute these witnesses by reducing the problem of generating 
counterexamples for general properties over Markov Decision 
Processes, in several steps, to the easy problem of generating 
counterexamples for reachability properties over acyclic Markov 
Chains. 

5.1 Introduction 

Model checking is an automated technique that, given a finite-state model 
of a system and a property stated in an appropriate logical formalism, 
systematically checks the validity of this property. Model checking is a 
general approach and is applied in areas like hardware verification and 
software engineering. 

Nowadays, the interaction geometry of distributed systems and network 
protocols calls for probabilistic, or more generally, quantitative estimates 
of, e.g., performance and cost measures. Randomized algorithms are in- 
creasingly utilized to achieve high performance at the cost of obtaining 
correct answers only with high probability. For all this, there is a wide 
range of models and applications in computer science requiring quantitative 
analysis. Probabilistic model checking allows us to check whether or not 
a probabilistic property is satisfied in a given model, e.g., "Is every message 
sent successfully received with probability greater than or equal to 0.99?". 

A major strength of model checking is the possibility of generating di- 
agnostic information in case the property is violated. This diagnostic in- 
formation is provided through a counterexample showing an execution of 
the model that invalidates the property under verification. Besides the 
immediate feedback in model checking, counterexamples are also used in 
abstraction-refinement techniques [CGJ+00], and provide the foundations 
for schedule derivation (see, e.g., [BLR05, Feh02]). 

Although counterexample generation was studied from the very begin- 
ning in most model checking techniques, this has not been the case for 
probabilistic model checking. Only recently [AHL05, And06, AL06, HK07a, 
HK07b, AL09] attention was drawn to this subject, fifteen years after the 
first studies on probabilistic model checking. Contrary to other model 
checking techniques, counterexamples in this setting are not given by a 
single execution path. Instead, they are sets of executions of the system 
satisfying a certain undesired property whose probability mass is higher 
than a given bound. Since counterexamples are used as a diagnostic tool, 
previous works on counterexamples have presented them as sets of finite 
paths with probability large enough. We refer to these sets as 
representative counterexamples. Elements of representative counterexamples with 
high probability have been considered the most informative since they 
contribute the most to the property refutation. 

A challenge in counterexample generation for probabilistic model checking 
is that (1) representative counterexamples are very large (often infinite), 
(2) many of their elements have very low probability (which implies that they 
are very distant from the counterexample), and (3) elements can be 
extremely similar to each other (consequently providing similar diagnostic 
information). Even worse, (4) sometimes the finite paths with highest 
probability do not indicate the most likely violation of the property under 
consideration. 

For example, look at the Markov Chain $\mathcal{D}$ in Figure 5.1. The property 
$\mathcal{D} \models_{\le 0.5} \Diamond\psi$, stating that execution reaches a state satisfying $\psi$ (i.e., reaches 
$s_3$ or $s_4$) with probability lower than or equal to 0.5, is violated (since the 
probability of reaching $\psi$ is 1). The left hand side of the table in Figure 5.2 
lists finite paths reaching $\psi$ ranked according to their probability. Note 
that finite paths with highest probability take the left branch in the 
system, whereas the right branch in itself has higher probability, illustrating 
Problem 4. To adjust the model so that it does satisfy the property (bug 
fixing), it is not sufficient to modify the left hand side of the system alone; 
no matter how one changes the left hand side, the probability of reaching $\psi$ 
remains at least 0.6. Furthermore, the first six finite paths provide similar 
diagnostic information: they just make extra loops in $s_1$. This is an 
example of Problem 3. Additionally, the probability of every single finite path 
is far below the bound 0.5, making it unclear if a particular path is 
important; see Problem 2 above. Finally, the (unique) counterexample for the 
property $\mathcal{D} \models_{<1} \Diamond\psi$ consists of infinitely many finite paths (namely all finite 
paths of $\mathcal{D}$); see Problem 1. 

To overcome these problems, we partition a 
representative counterexample into sets of finite paths that follow a similar 
pattern. We call these sets witnesses. To ensure that witnesses provide 
valuable diagnostic information, we desire that the set of witnesses that 
form a counterexample satisfies several properties: two different witnesses 
should provide different diagnostic information (solving Problem 3) and 
elements of a single witness should provide similar diagnostic information; 
as a consequence witnesses have a high probability mass (solving Problems 
2 and 4), and the number of witnesses of a representative counterexample 
should be finite (solving Problem 1). 

Figure 5.1: Markov Chain 

In our setting, witnesses consist of paths that behave the same outside 
strongly connected components. In the example of Figure 5.1, there are 
two witnesses: the set of all finite paths going right, represented by $[s_0 s_2 s_4]$, 
whose probability (mass) is 0.6, and the set of all finite paths going left, 
represented by $[s_0 s_1 s_3]$, with probability (mass) 0.4. 

In this chapter, we show how to obtain such sets of witnesses for bounded 
probabilistic LTL properties on Markov Decision Processes (MDP). In 
fact, we first show how to reduce this problem to finding witnesses for up- 
per bounded probabilistic reachability properties on discrete time Markov 
Chains (MCs). The main technical difficulties lie in this last problem, to 
which most of the chapter is devoted. 

In a nutshell, the process to find witnesses for the violation of $\mathcal{D} \models_{\le p} \Diamond\psi$, 
with $\mathcal{D}$ being an MC, is as follows. We first eliminate from the original MC 
all the "uninteresting" parts. This proceeds as the first steps of the model 
checking process: make absorbing all states satisfying $\psi$, and all states that 
cannot reach $\psi$, obtaining a new MC $\mathcal{D}_\psi$. Next, reduce this last MC to an 
acyclic MC $Ac(\mathcal{D}_\psi)$ in which all strongly connected components have been 
conveniently abstracted with a single probabilistic transition. The original 
and the acyclic MCs are related by a mapping that, to each finite path in 
$Ac(\mathcal{D}_\psi)$ (that we call rail), assigns a set of finite paths behaving similarly 
in $\mathcal{D}$ (that we call torrent). This map preserves the probability of reaching 
$\psi$ and hence relates counterexamples in $Ac(\mathcal{D}_\psi)$ to counterexamples in $\mathcal{D}$. 
Finally, counterexamples in $Ac(\mathcal{D}_\psi)$ are computed by reducing the problem 
to a shortest path problem, as in [HK07a]. Because $Ac(\mathcal{D}_\psi)$ is acyclic, 
the complexity is lower than the corresponding problem in [HK07a]. 
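The grouping of finite paths into witnesses can be illustrated with a small Python sketch. The chain below is hypothetical (it is not the chain of Figure 5.1, whose transition probabilities are not reproduced here), and the only cycles it contains are self-loops, so collapsing repeated states is enough to map a path to its skeleton. The general construction abstracts arbitrary strongly connected components and is not attempted here; all function names are assumptions.

```python
from collections import defaultdict

# Hypothetical Markov chain; s3 and s4 are the states satisfying psi.
P = {
    "s0": {"s1": 0.4, "s2": 0.6},
    "s1": {"s1": 0.5, "s3": 0.5},
    "s2": {"s2": 0.9, "s4": 0.1},
}
TARGET = {"s3", "s4"}

def paths_to_target(max_len):
    """Yield (path, probability) for finite paths from s0 ending at their first target state."""
    stack = [(("s0",), 1.0)]
    while stack:
        path, prob = stack.pop()
        last = path[-1]
        if last in TARGET:
            yield path, prob
            continue
        if len(path) >= max_len:
            continue  # truncate: longer paths carry negligible probability here
        for succ, p in P[last].items():
            stack.append((path + (succ,), prob * p))

def skeleton(path):
    """Collapse consecutive repetitions of a state (all cycles here are self-loops)."""
    out = [path[0]]
    for s in path[1:]:
        if s != out[-1]:
            out.append(s)
    return tuple(out)

def witness_masses(max_len=400):
    """Group paths by skeleton; return the total mass and the best single path per group."""
    masses, best = defaultdict(float), {}
    for path, prob in paths_to_target(max_len):
        key = skeleton(path)
        masses[key] += prob
        best[key] = max(best.get(key, 0.0), prob)
    return dict(masses), best

masses, best = witness_masses()
```

In this chain the most probable single path goes left (probability 0.2 against 0.06 for the best path going right), yet the group of paths going right has the larger mass (about 0.6 against about 0.4), which is the phenomenon described as Problem 4.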

It is worth mentioning that our technique can also be applied to pCTL 
formulas without nested path quantifiers. 

Looking ahead, Section 5.2 presents the necessary background on Markov 
Chains (MC), Markov Decision Processes (MDP), and Linear Temporal 
Logic (LTL). Section 5.3 presents the definition of counterexamples and 
discusses the reduction from general LTL formulas to upper bounded prob- 
abilistic reachability properties, and the extraction of the maximizing MC 
in an MDP. Section 5.4 discusses desired properties of counterexamples. In 
Sections 5.5 and 5.6 we introduce the fundamentals on rails and torrents, 
the reduction of the original MC to the acyclic one, and our notion of signif- 
icant diagnostic counterexamples. Section 5.7 then presents the techniques 
to actually compute counterexamples. In Section 5.8 we discuss related 
work and give final conclusions. 

5.2 Preliminaries 

We now recall the notions of Markov Decision Processes, Markov Chains, 
and Linear Temporal Logic. 

5.2.1 Markov Decision Processes 

Markov Decision Processes (MDPs) constitute a formalism that combines 
nondeterministic and probabilistic choices. They are an important model 
in corporate finance, supply chain optimization, system verification and 
optimization. There are many slightly different variants of this formalism
such as action-labeled MDPs [Bel57, FV97] and probabilistic automata [SL95,
SdV04]; we work with the state-labeled MDPs from [BdA95].

Definition 5.2.1. Let $S$ be a finite set. A probability distribution on $S$ is
a function $p : S \to [0,1]$ such that $\sum_{s \in S} p(s) = 1$. We denote the
set of all probability distributions on $S$ by $\mathrm{Distr}(S)$. Additionally,
we define the Dirac distribution on an element $s \in S$ as $1_s$, i.e.,
$1_s(s) = 1$ and $1_s(t) = 0$ for all $t \in S \setminus \{s\}$.

Definition 5.2.2. A Markov Decision Process (MDP) is a quadruple
$\mathcal{M} = (S, s_0, L, \tau)$, where

• $S$ is the finite state space;

• $s_0 \in S$ is the initial state;

• $L$ is a labeling function that associates to each state $s \in S$ a set $L(s)$
of propositional variables that are valid in $s$;

• $\tau : S \to \wp(\mathrm{Distr}(S))$ is a function that associates to each
$s \in S$ a non-empty and finite subset of $\mathrm{Distr}(S)$ of probability
distributions.

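Definition 5.2.3. Let $\mathcal{M} = (S, s_0, \tau, L)$ be an MDP. We define a
successor relation $\delta \subseteq S \times S$ by
$\delta = \{(s,t) \mid \exists \pi \in \tau(s) \,.\, \pi(t) > 0\}$ and for each
state $s \in S$ we define the sets

$$\mathrm{Paths}(\mathcal{M}, s) = \{t_0 t_1 t_2 \ldots \in S^\omega \mid t_0 = s \land \forall n \in \mathbb{N} \,.\, \delta(t_n, t_{n+1})\}$$ and
$$\mathrm{Paths}^*(\mathcal{M}, s) = \{t_0 t_1 \ldots t_n \in S^* \mid t_0 = s \land \forall\, 0 \le i < n \,.\, \delta(t_i, t_{i+1})\}$$

of paths of $\mathcal{M}$ and finite paths of $\mathcal{M}$ respectively beginning
at $s$. We usually omit $\mathcal{M}$ from the notation; we also abbreviate
$\mathrm{Paths}(\mathcal{M}, s_0)$ as $\mathrm{Paths}(\mathcal{M})$ and
$\mathrm{Paths}^*(\mathcal{M}, s_0)$ as $\mathrm{Paths}^*(\mathcal{M})$. For
$\omega \in \mathrm{Paths}(s)$, we write the $(n+1)$-st state of $\omega$ as
$\omega_n$. As usual, we let $\mathcal{B}_s \subseteq \wp(\mathrm{Paths}(s))$ be the
Borel $\sigma$-algebra on the cones
$\langle t_0 \ldots t_n \rangle = \{\omega \in \mathrm{Paths}(s) \mid \omega_0 = t_0 \land \ldots \land \omega_n = t_n\}$.
Additionally, for a set of finite paths $\Lambda \subseteq \mathrm{Paths}^*(s)$, we
define $\langle \Lambda \rangle = \bigcup_{\sigma \in \Lambda} \langle \sigma \rangle$.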

Figure 5.3 shows an MDP. Absorbing states (i.e., states $s$ with
$\tau(s) = \{1_s\}$) are represented by double lines. This MDP features a single
nondeterministic decision, to be made in state $s_0$, namely $\pi_1$ and $\pi_2$.
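As an illustration of Definitions 5.2.1 to 5.2.3, the following minimal Python sketch encodes a small state-labeled MDP as a dictionary. The concrete states, probabilities and helper names (`dirac`, `successors`, `is_absorbing`) are assumptions made for this example only; they are not the MDP of Figure 5.3.

```python
from typing import Dict, List, Set, Tuple

Dist = Dict[str, float]  # a probability distribution on states


def dirac(s: str) -> Dist:
    """Dirac distribution 1_s: all mass on s."""
    return {s: 1.0}


# Hypothetical MDP: tau maps each state to a non-empty list of distributions.
tau: Dict[str, List[Dist]] = {
    "s0": [{"s1": 0.6, "s2": 0.4}, {"s2": 1.0}],  # two choices, pi_1 and pi_2
    "s1": [dirac("s1")],
    "s2": [dirac("s2")],
}


def successors(tau: Dict[str, List[Dist]]) -> Set[Tuple[str, str]]:
    """delta = {(s, t) | some pi in tau(s) has pi(t) > 0}."""
    return {(s, t) for s, dists in tau.items()
            for pi in dists for t, pr in pi.items() if pr > 0}


def is_absorbing(tau: Dict[str, List[Dist]], s: str) -> bool:
    """A state is absorbing when tau(s) = {1_s}."""
    return tau[s] == [dirac(s)]
```

Representing $\tau(s)$ as a list keeps the nondeterministic choices indexable, which is convenient when a scheduler later has to name one of them.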






Figure 5.3: Markov Decision Process 

Definition 5.2.4. Let $\mathcal{M} = (S, s_0, \tau, L)$ be an MDP, $s \in S$ and
$A \subseteq S$. We define the sets of paths and finite paths reaching $A$ from
$s$ as

$$\mathrm{Reach}(\mathcal{M}, s, A) = \{\omega \in \mathrm{Paths}(\mathcal{M}, s) \mid \exists_{i \ge 0} \,.\, \omega_i \in A\}$$ and
$$\mathrm{Reach}^*(\mathcal{M}, s, A) = \{\sigma \in \mathrm{Paths}^*(\mathcal{M}, s) \mid \mathrm{last}(\sigma) \in A \land \forall_{i < |\sigma| - 1} \,.\, \sigma_i \notin A\}$$

respectively. Note that $\mathrm{Reach}^*(\mathcal{M}, s, A)$ consists of those
finite paths $\sigma$ starting on $s$ reaching $A$ exactly once, at the end of the
execution. It is easy to check that these sets are prefix free, i.e., they
contain finite paths such that none of them is a prefix of another one.
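To make the prefix-freeness remark concrete, here is a small Python sketch that enumerates the elements of $\mathrm{Reach}^*$ up to a length bound and checks that no enumerated path is a prefix of another. The successor map `delta` and the bound `max_len` are assumptions of this example; the set itself may be infinite, so only a bounded part is listed.

```python
def reach_star(delta, s, targets, max_len):
    """Finite paths from s that hit `targets` exactly once, at their last
    state, restricted to paths with at most max_len states."""
    result, stack = [], [(s,)]
    while stack:
        path = stack.pop()
        if path[-1] in targets:
            result.append(path)  # never extend a path past its first hit
            continue
        if len(path) < max_len:
            for t in sorted(delta.get(path[-1], ())):
                stack.append(path + (t,))
    return result


def is_prefix_free(paths):
    """True when no path in the collection is a proper prefix of another."""
    ps = set(paths)
    return not any(p[:k] in ps for p in ps for k in range(1, len(p)))


# Hypothetical successor relation: s0 may loop or move to the target s1.
delta = {"s0": {"s0", "s1"}, "s1": {"s1"}}
paths = reach_star(delta, "s0", {"s1"}, max_len=3)
```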

5.2.2 Schedulers 

Schedulers (also called strategies, adversaries, or policies) resolve the non- 
deterministic choices in an MDP [PZ93, Var85, BdA95]. 

Definition 5.2.5. Let $\mathcal{M} = (S, s_0, \tau, L)$ be an MDP. A scheduler
$\eta$ on $\mathcal{M}$ is a function from $\mathrm{Paths}^*(\mathcal{M})$ to
$\mathrm{Distr}(\wp(\mathrm{Distr}(S)))$ such that for all
$\sigma \in \mathrm{Paths}^*(\mathcal{M})$ we have
$\eta(\sigma) \in \mathrm{Distr}(\tau(\mathrm{last}(\sigma)))$. We denote the set of
all schedulers on $\mathcal{M}$ by $\mathrm{Sch}(\mathcal{M})$.

Note that our schedulers are randomized, i.e., in a finite path $\sigma$ a
scheduler chooses an element of $\tau(\mathrm{last}(\sigma))$ probabilistically.
Under a scheduler $\eta$, the probability that the next state reached after the
path $\sigma$ is $t$ equals
$\sum_{\pi \in \tau(\mathrm{last}(\sigma))} \eta(\sigma)(\pi) \cdot \pi(t)$. In this
way, a scheduler induces a probability measure on $\mathcal{B}_s$ as usual.

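Definition 5.2.6. Let $\mathcal{M} = (S, s_0, \tau, L)$ be an MDP and $\eta$ a
scheduler on $\mathcal{M}$. We define the probability measure $\mathbb{P}_\eta$ as
the unique measure on $\mathcal{B}_{s_0}$ such that for all
$s_0 s_1 \ldots s_n \in \mathrm{Paths}^*(\mathcal{M})$

$$\mathbb{P}_\eta(\langle s_0 s_1 \ldots s_n \rangle) = \prod_{i=0}^{n-1} \; \sum_{\pi \in \tau(s_i)} \eta(s_0 s_1 \ldots s_i)(\pi) \cdot \pi(s_{i+1}).$$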
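The product formula of Definition 5.2.6 can be evaluated directly. The sketch below is a hedged illustration: the MDP `tau`, the encoding of a scheduler as a function returning one weight per distribution in $\tau(\mathrm{last}(\sigma))$, and the names `cone_probability`, `uniform` and `first` are assumptions of this example.

```python
def cone_probability(tau, eta, path):
    """P_eta(<s_0 ... s_n>): product over i < n of
    sum_{pi in tau(s_i)} eta(s_0..s_i)(pi) * pi(s_{i+1})."""
    prob = 1.0
    for i in range(len(path) - 1):
        weights = eta(path[: i + 1])  # one weight per distribution in tau(s_i)
        prob *= sum(w * pi.get(path[i + 1], 0.0)
                    for w, pi in zip(weights, tau[path[i]]))
    return prob


# Hypothetical MDP with one nondeterministic choice in s0.
tau = {
    "s0": [{"s1": 0.6, "s2": 0.4}, {"s2": 1.0}],
    "s1": [{"s1": 1.0}],
    "s2": [{"s2": 1.0}],
}


def uniform(prefix):
    """Randomized scheduler: choose uniformly among the available distributions."""
    n = len(tau[prefix[-1]])
    return [1.0 / n] * n


def first(prefix):
    """Deterministic memoryless scheduler: always pick the first distribution."""
    n = len(tau[prefix[-1]])
    return [1.0] + [0.0] * (n - 1)
```

Under `uniform`, the cone of `("s0", "s2")` gets mass 0.5 · 0.4 + 0.5 · 1.0; under `first` it gets 0.4.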

We now recall the notions of deterministic and memoryless schedulers.

Definition 5.2.7. Let $\mathcal{M}$ be an MDP and $\eta$ a scheduler on
$\mathcal{M}$. We say that $\eta$ is deterministic if $\eta(\sigma)(\pi_i)$ is
either 0 or 1 for all $\pi_i \in \tau(\mathrm{last}(\sigma))$ and all
$\sigma \in \mathrm{Paths}^*(\mathcal{M})$. We say that a scheduler is memoryless
if for all finite paths $\sigma_1, \sigma_2$ of $\mathcal{M}$ with
$\mathrm{last}(\sigma_1) = \mathrm{last}(\sigma_2)$ we have
$\eta(\sigma_1) = \eta(\sigma_2)$.

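Definition 5.2.8. Let $\mathcal{M}$ be an MDP and $\Delta \in \mathcal{B}_{s_0}$.
Then the maximal probability $\mathbb{P}^+$ and minimal probability
$\mathbb{P}^-$ of $\Delta$ are defined by

$$\mathbb{P}^+(\Delta) = \sup_{\eta \in \mathrm{Sch}(\mathcal{M})} \mathbb{P}_\eta(\Delta) \quad\text{and}\quad \mathbb{P}^-(\Delta) = \inf_{\eta \in \mathrm{Sch}(\mathcal{M})} \mathbb{P}_\eta(\Delta).$$

A scheduler that attains $\mathbb{P}^+(\Delta)$ or $\mathbb{P}^-(\Delta)$ is called
a maximizing or minimizing scheduler respectively.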

5.2.3 Markov Chains 

A (discrete time) Markov Chain is an MDP associating exactly one prob- 
ability distribution to each state. In this way nondeterministic choices are 
no longer allowed. 

Definition 5.2.9 (Markov Chain). Let $\mathcal{M} = (S, s_0, \tau, L)$ be an MDP.
If $|\tau(s)| = 1$ for all $s \in S$, then we say that $\mathcal{M}$ is a Markov
Chain (MC).

In order to simplify notation we represent probabilistic transitions on
MCs by means of a probabilistic matrix $\mathcal{P}$ instead of $\tau$.
Additionally, we denote by $\mathbb{P}_{\mathcal{D},s}$ the probability measure
induced by an MC $\mathcal{D}$ with initial state $s$ and we abbreviate
$\mathbb{P}_{\mathcal{D},s_0}$ as $\mathbb{P}_{\mathcal{D}}$.

5.2.4 Linear Temporal Logic 

Linear temporal logic (LTL) [MP91] is a modal temporal logic with modalities
referring to time. In LTL it is possible to encode formulas about the
future of paths: a condition will eventually be true, a condition will be true
until another fact becomes true, etc.

Definition 5.2.10. LTL is built up from the set of propositional variables
$\mathcal{V}$, the logical connectives $\lnot$, $\land$, and a temporal modal
operator $\mathcal{U}$ by the following grammar:

$$\phi ::= \mathcal{V} \mid \lnot\phi \mid \phi \land \phi \mid \phi \,\mathcal{U}\, \phi.$$

Using these operators we define $\lor$, $\to$, $\Diamond$, and $\Box$ in the
standard way.

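Definition 5.2.11. Let $\mathcal{M} = (S, s_0, \tau, L)$ be an MDP. We define
satisfiability for paths $\omega$ in $\mathcal{M}$, propositional variables
$v \in \mathcal{V}$, and LTL formulas $\phi$, $\gamma$ inductively by

$$\begin{aligned}
\omega \models_{\mathcal{M}} v &\iff v \in L(\omega_0) \\
\omega \models_{\mathcal{M}} \phi \land \gamma &\iff \omega \models_{\mathcal{M}} \phi \text{ and } \omega \models_{\mathcal{M}} \gamma \\
\omega \models_{\mathcal{M}} \lnot\phi &\iff \text{not}(\omega \models_{\mathcal{M}} \phi) \\
\omega \models_{\mathcal{M}} \phi \,\mathcal{U}\, \gamma &\iff \exists_{i \ge 0} \,.\, \omega_{\downarrow i} \models_{\mathcal{M}} \gamma \text{ and } \forall_{0 \le j < i} \,.\, \omega_{\downarrow j} \models_{\mathcal{M}} \phi
\end{aligned}$$

where $\omega_{\downarrow i}$ is the $i$-th suffix of $\omega$. When confusion is
unlikely, we omit the subscript $\mathcal{M}$ on the satisfiability relation.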

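Definition 5.2.12. Let $\mathcal{M}$ be an MDP. We define the language
$\mathrm{Sat}_{\mathcal{M}}(\phi)$ associated to an LTL formula $\phi$ as the set
of paths satisfying $\phi$, i.e.,
$\mathrm{Sat}_{\mathcal{M}}(\phi) = \{\omega \in \mathrm{Paths}(\mathcal{M}) \mid \omega \models \phi\}$.
Here we also generally omit the subscript $\mathcal{M}$.

We now define satisfiability of an LTL formula $\phi$ on an MDP $\mathcal{M}$. We
say that $\mathcal{M}$ satisfies $\phi$ with probability at most $p$
($\mathcal{M} \models_{\le p} \phi$) if the probability of getting an execution
satisfying $\phi$ is at most $p$.

Definition 5.2.13. Let $\mathcal{M}$ be an MDP, $\phi$ an LTL formula and
$p \in [0, 1]$. We define $\models_{\le p}$ and $\models_{\ge p}$ by

$$\mathcal{M} \models_{\le p} \phi \iff \mathbb{P}^+(\mathrm{Sat}(\phi)) \le p,$$
$$\mathcal{M} \models_{\ge p} \phi \iff \mathbb{P}^-(\mathrm{Sat}(\phi)) \ge p.$$

We define $\mathcal{M} \models_{< p} \phi$ and $\mathcal{M} \models_{> p} \phi$ in a
similar way. In case the MDP is fully probabilistic, i.e., an MC, the
satisfiability problem is reduced to
$\mathcal{M} \models_{\bowtie p} \phi \iff \mathbb{P}_{\mathcal{M}}(\mathrm{Sat}(\phi)) \bowtie p$,
where $\bowtie \in \{<, \le, >, \ge\}$.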

5.3 Counterexamples 

In this section, we define what counterexamples are and how the problem of 
finding counterexamples to a general LTL property over Markov Decision 
Processes reduces to finding counterexamples to reachability problems over 
Markov Chains. 

Definition 5.3.1 (Counterexamples). Let $\mathcal{M}$ be an MDP and $\phi$ an LTL
formula. A counterexample to $\mathcal{M} \models_{\le p} \phi$ is a measurable set
$C \subseteq \mathrm{Sat}(\phi)$ such that $\mathbb{P}^+(C) > p$. Counterexamples to
$\mathcal{M} \models_{< p} \phi$ are defined similarly.

Counterexamples to $\mathcal{M} \models_{> p} \phi$ and
$\mathcal{M} \models_{\ge p} \phi$ cannot be defined straightforwardly, as it is
always possible to find a set $C \subseteq \mathrm{Sat}(\phi)$ such that
$\mathbb{P}^-(C) \le p$ or $\mathbb{P}^-(C) < p$; note that the empty set trivially
satisfies it. Therefore, the best way to find counterexamples to lower bounded
probabilities is to find counterexamples to the dual properties
$\mathcal{M} \models_{< 1-p} \lnot\phi$ and $\mathcal{M} \models_{\le 1-p} \lnot\phi$.
That is, while for upper bounded probabilities a counterexample is a set of
paths satisfying the property with mass probability beyond the bound, for lower
bounded probabilities the counterexample is a set of paths that does not
satisfy the property with sufficient probability.

Example 5.3.1. Consider the MDP $\mathcal{M}$ of Figure 5.4 and the LTL formula
$\Diamond v$. It is easy to check that $\mathcal{M} \not\models_{<1} \Diamond v$.
The set

$$C = \mathrm{Sat}(\Diamond v) = \{\rho \in \mathrm{Paths}(s_0) \mid \exists_{i \ge 0} \,.\, \rho = s_0 (s_1)^i (s_4)^\omega\} \cup \{\rho \in \mathrm{Paths}(s_0) \mid \exists_{i \ge 0} \,.\, \rho = s_0 (s_3)^i (s_5)^\omega\}$$

is a counterexample. Note that $\mathbb{P}_\eta(C) = 1$ where $\eta$ is any
deterministic scheduler on $\mathcal{M}$ satisfying $\eta(s_0) = \pi_1$.

Figure 5.4:

LTL formulas are actually checked by reducing the model checking problem to
a reachability problem [dAKM97]. For checking upper bounded probabilities, the
LTL formula is translated into an equivalent deterministic Rabin automaton and
composed with the MDP under verification. On the obtained MDP, the set of
states forming accepting end components (SCC that traps accepting conditions
with probability 1) are identified. The maximum probability of the LTL property
on the original MDP is the same as the maximum probability of reaching a state
of an accepting end component in the final MDP. Hence, from now on we will
focus on counterexamples to properties of the form
$\mathcal{M} \models_{\le p} \Diamond\psi$ or $\mathcal{M} \models_{< p} \Diamond\psi$,
where $\psi$ is a propositional formula, i.e., a formula without temporal
operators.

In the following, it will be useful to identify the set of states in which a
propositional property is valid.

Definition 5.3.2. Let $\mathcal{M}$ be an MDP. We define the state language
$\mathrm{Sat}_{\mathcal{M}}(\psi)$ associated to a propositional formula $\psi$ as
the set of states satisfying $\psi$, i.e.,
$\mathrm{Sat}_{\mathcal{M}}(\psi) = \{s \in S \mid s \models \psi\}$, where $\models$
has the obvious satisfaction meaning for states. As usual, we generally omit
the subscript $\mathcal{M}$.



We will show now that, in order to find a counterexample to a property 
in an MDP with respect to an upper bound, it suffices to find a counterex- 
ample for the MC induced by the maximizing scheduler. The maximizing 
scheduler turns out to be deterministic and memoryless [BdA95]; conse- 
quently the induced Markov Chain can be easily extracted from the MDP 
as follows. 



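Definition 5.3.3. Let $\mathcal{M} = (S, s_0, \tau, L)$ be an MDP and $\eta$ a
deterministic memoryless scheduler. Then we define the MC induced by $\eta$ as
$\mathcal{M}_\eta = (S, s_0, \mathcal{P}_\eta, L)$ where
$\mathcal{P}_\eta(s,t) = (\eta(s))(t)$ for all $s,t \in S$. Here $\eta(s)$ is
identified with the single distribution of $\tau(s)$ that the scheduler selects
in state $s$.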
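A minimal Python sketch of this extraction, assuming the scheduler is given as a map from each state to the index of the distribution it selects (this encoding, the MDP `tau` and the name `induced_mc` are hypothetical, chosen only for illustration):

```python
def induced_mc(tau, eta):
    """Markov chain induced by a deterministic memoryless scheduler.

    tau: state -> list of distributions (dicts state -> probability)
    eta: state -> index of the distribution chosen in that state
    Returns the matrix P_eta as a dict of rows, P_eta[s][t] = (eta(s))(t).
    """
    return {s: dict(tau[s][eta[s]]) for s in tau}


# Hypothetical MDP and scheduler.
tau = {
    "s0": [{"s1": 0.6, "s2": 0.4}, {"s2": 1.0}],
    "s1": [{"s1": 1.0}],
    "s2": [{"s2": 1.0}],
}
eta = {"s0": 1, "s1": 0, "s2": 0}  # pick the second distribution in s0
P_eta = induced_mc(tau, eta)
```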

Now we state that finding counterexamples to upper bounded proba- 
bilistic reachability LTL properties on MDPs can be reduced to finding 
counterexamples to upper bounded probabilistic reachability LTL proper- 
ties on MCs. 

Theorem 5.3.4. Let $\mathcal{M}$ be an MDP, $\psi$ a propositional formula and
$p \in [0, 1]$. Then, there is a maximizing (deterministic memoryless) scheduler
$\eta$ such that
$\mathcal{M} \models_{\le p} \Diamond\psi \iff \mathcal{M}_\eta \models_{\le p} \Diamond\psi$.
Moreover, if $C$ is a counterexample to
$\mathcal{M}_\eta \models_{\le p} \Diamond\psi$ then $C$ is also a counterexample to
$\mathcal{M} \models_{\le p} \Diamond\psi$.

Note that $\eta$ can be computed by solving a linear minimization problem
[BdA95]. See Section 5.7.1.

5.4 Representative Counterexamples, Partitions 
and Witnesses 

The notion of counterexample from Definition 5.3.1 is very broad: just an 
arbitrary (measurable) set of paths with high enough mass probability. To 
be useful as a debugging tool (and in fact to be able to present the coun- 
terexample to a user), we need counterexamples with specific properties. 
We will partition counterexamples (or rather, representative counterexam- 
ples) in witnesses and list five informal properties that we consider valuable 
in order to increase the quality of witnesses as a debugging tool. 

We first note that for reachability properties it is sufficient to consider 
counterexamples that consist of finite paths. 

Definition 5.4.1 (Representative counterexamples). Let $\mathcal{M}$ be an MDP,
$\psi$ a propositional formula and $p \in [0, 1]$. A representative counterexample
to $\mathcal{M} \models_{\le p} \Diamond\psi$ is a set
$C \subseteq \mathrm{Reach}^*(\mathcal{M}, \mathrm{Sat}(\psi))$ such that
$\mathbb{P}^+(\langle C \rangle) > p$. We denote the set of all representative
counterexamples to $\mathcal{M} \models_{\le p} \Diamond\psi$ by
$\mathcal{R}(\mathcal{M}, p, \psi)$.

Observation 5.4.1. Let $\mathcal{M}$ be an MDP, $\psi$ a propositional formula and
$p \in [0, 1]$. If $C$ is a representative counterexample to
$\mathcal{M} \models_{\le p} \Diamond\psi$, then $\langle C \rangle$ is a
counterexample to $\mathcal{M} \models_{\le p} \Diamond\psi$. Furthermore, there
exists a counterexample to $\mathcal{M} \models_{\le p} \Diamond\psi$ if and only if
there exists a representative counterexample to
$\mathcal{M} \models_{\le p} \Diamond\psi$.
Following [HK07a], we present the notions of minimum counterexample, 
strongest evidence and most indicative counterexamples. 

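Definition 5.4.2 (Minimum counterexample). Let $\mathcal{D}$ be an MC, $\psi$ a
propositional formula and $p \in [0, 1]$. We say that
$C \in \mathcal{R}(\mathcal{D}, p, \psi)$ is a minimum counterexample if
$|C| \le |C'|$, for all $C' \in \mathcal{R}(\mathcal{D}, p, \psi)$.

Definition 5.4.3 (Strongest evidence). Let $\mathcal{D}$ be an MC, $\psi$ a
propositional formula and $p \in [0, 1]$. A strongest evidence to
$\mathcal{D} \not\models_{\le p} \Diamond\psi$ is a finite path
$\sigma \in \mathrm{Reach}^*(\mathcal{D}, \mathrm{Sat}(\psi))$ such that
$\mathbb{P}_{\mathcal{D}}(\langle\sigma\rangle) \ge \mathbb{P}_{\mathcal{D}}(\langle\rho\rangle)$,
for all $\rho \in \mathrm{Reach}^*(\mathcal{D}, \mathrm{Sat}(\psi))$.

Definition 5.4.4 (Most indicative counterexample). Let $\mathcal{D}$ be an MC,
$\psi$ a propositional formula and $p \in [0,1]$. We call
$C \in \mathcal{R}(\mathcal{D}, p, \psi)$ a most indicative counterexample if it is
minimum and
$\mathbb{P}_{\mathcal{D}}(\langle C \rangle) \ge \mathbb{P}_{\mathcal{D}}(\langle C' \rangle)$,
for all minimum counterexamples $C' \in \mathcal{R}(\mathcal{D}, p, \psi)$.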
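The three notions above can be illustrated with a best-first enumeration: finite paths are expanded in order of decreasing probability, so paths reaching the target are emitted strongest first, and collecting them until their mass exceeds $p$ gives a minimum counterexample with the largest mass among minimum ones. This Python sketch is a simplification written for this text and is not the k-shortest-paths algorithm of [HK07a]; the function name, the `max_steps` guard and the example chain are assumptions.

```python
import heapq


def most_indicative(P, s0, targets, p, max_steps=10_000):
    """Collect the most probable paths of Reach*(s0, targets) until their
    total mass exceeds p. Returns (paths, mass), or (None, mass) when the
    bound is not exceeded within max_steps expansions."""
    heap = [(-1.0, (s0,))]  # max-heap on path probability via negation
    chosen, mass, steps = [], 0.0, 0
    while heap and mass <= p and steps < max_steps:
        steps += 1
        neg, path = heapq.heappop(heap)
        if path[-1] in targets:
            chosen.append(path)  # next strongest evidence
            mass += -neg
            continue
        for t, pr in P[path[-1]].items():
            if pr > 0:
                heapq.heappush(heap, (neg * pr, path + (t,)))
    return (chosen, mass) if mass > p else (None, mass)


# Hypothetical MC: s1 and s2 satisfy psi, s3 does not.
P = {
    "s0": {"s1": 0.6, "s2": 0.3, "s3": 0.1},
    "s1": {"s1": 1.0},
    "s2": {"s2": 1.0},
    "s3": {"s3": 1.0},
}
paths, mass = most_indicative(P, "s0", {"s1", "s2"}, p=0.7)
```

The first path returned is a strongest evidence; extending a partial path can only lower its probability, which is why the heap order is sound.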

Unfortunately, most indicative counterexamples are very often very large (even
infinite), many of their elements have insignificant measure, and elements can
be extremely similar to each other (consequently providing the same diagnostic
information). Even worse, sometimes the finite paths with highest probability
do not exhibit the way in which the system accumulates higher probability to
reach the undesired property (and consequently where an error occurs with
higher probability). For these reasons, we are of the opinion that
representative counterexamples are still too general to be useful as feedback
information. We approach this problem by refining a representative
counterexample into sets of finite paths following a "similarity" criterion
(introduced in Section 5.5). These sets are called witnesses of the
counterexample.

Recall that a set $Y$ of nonempty sets is a partition of $X$ if the elements of
$Y$ cover $X$ and are pairwise disjoint. We define counterexample partitions
in the following way.

Definition 5.4.5 (Counterexample partitions and witnesses). Let $\mathcal{M}$ be
an MDP, $\psi$ a propositional formula, $p \in [0, 1]$, and $C$ a representative
counterexample to $\mathcal{M} \models_{\le p} \Diamond\psi$. A counterexample
partition $W_C$ is a partition of $C$. We call the elements of $W_C$ witnesses.

Since not every partition generates useful witnesses (from the debugging 
perspective), we now state five informal properties that we consider valuable 
in order to improve the diagnostic information provided by witnesses. In 
Section 5.7 we show how to partition the representative counterexample in 
order to obtain witnesses satisfying most of these properties. 

Similarity: Elements of a witness should provide similar debugging 
information. 

Accuracy: Witnesses with higher probability should exhibit evolu- 
tions of the system with higher probability of containing errors. 

Originality: Different witnesses should provide different debugging 
information. 

Significance: Witnesses should be as close to the counterexample
as possible (their mass probability should be as close as possible to
the bound p).

Finiteness: The number of witnesses of a counterexample partition 
should be finite. 

5.5 Rails and Torrents 

As argued before, we consider that representative counterexamples are
excessively general to be useful as feedback information. Therefore, we group
finite paths of a representative counterexample in witnesses if they are
"similar enough". We will consider finite paths that behave the same outside
SCCs of the system as providing similar feedback information.

In order to formalize this idea, we first reduce the original MC $\mathcal{D}$ to
an acyclic MC preserving reachability probabilities. We do so by removing
all SCCs $K$ of $\mathcal{D}$, keeping just input states of $K$. In this way, we
get a new acyclic MC denoted by $\mathrm{Ac}(\mathcal{D})$. The probability matrix
of this Markov Chain relates input states of each SCC to its output states with
the reachability probability between these states in $\mathcal{D}$. Secondly, we
establish a map between finite paths $\sigma$ in $\mathrm{Ac}(\mathcal{D})$ (rails)
and sets of paths in $\mathcal{D}$ (torrents). Each torrent contains finite paths
that are similar, i.e., behave the same outside SCCs. We conclude the section by
showing that the probability of $\sigma$ is equal to the mass probability of its
associated torrent.

Reduction to Acyclic Markov Chains 

Consider an MC $\mathcal{D} = (S, s_0, \mathcal{P}, L)$. Recall that a subset
$K \subseteq S$ is called strongly connected if for every $s, t \in K$ there is a
finite path from $s$ to $t$. Additionally $K$ is called a strongly connected
component (SCC) if it is a maximally (with respect to $\subseteq$) strongly
connected subset of $S$.

Note that every state is a member of exactly one SCC of $\mathcal{D}$, even those
states that are not involved in cycles, since the trivial finite path $s$
connects $s$ to itself. We call the SCCs containing absorbing states or states
not involved in cycles trivial strongly connected components (note that trivial
SCCs are composed of one single state). From now on we let SCC* be the
set of non trivial strongly connected components of an MC.

A Markov Chain is called acyclic if it contains only trivial SCCs. Note 
that an acyclic Markov Chain still has absorbing states. 

Definition 5.5.1 (Input and Output states). Let
$\mathcal{D} = (S, s_0, \mathcal{P}, L)$ be an MC. Then, for each SCC* $K$ of
$\mathcal{D}$, we define the sets $\mathrm{Inp}_K \subseteq S$ of all states in $K$
that have an incoming transition from a state outside of $K$ and
$\mathrm{Out}_K \subseteq S$ of all states outside of $K$ that have an incoming
transition from a state of $K$ in the following way

$$\mathrm{Inp}_K = \{t \in K \mid \exists s \in S \setminus K \,.\, \mathcal{P}(s, t) > 0\},$$
$$\mathrm{Out}_K = \{s \in S \setminus K \mid \exists t \in K \,.\, \mathcal{P}(t, s) > 0\}.$$







We also define for each SCC* $K$ an MC related to $K$ as
$\mathcal{D}_K = (K \cup \mathrm{Out}_K, s_K, \mathcal{P}_K, L_K)$ where $s_K$ is
any state in $\mathrm{Inp}_K$, $L_K(s) = L(s)$, and $\mathcal{P}_K(s,t)$ is equal
to $\mathcal{P}(s,t)$ if $s \in K$ and equal to $1_s(t)$ otherwise. Additionally,
for every state $s$ involved in non trivial SCCs we define $\mathrm{SCC}^+_s$ as
$\mathcal{D}_K$, where $K$ is the SCC* of $\mathcal{D}$ such that $s \in K$.
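The sets SCC*, $\mathrm{Inp}_K$ and $\mathrm{Out}_K$ can be computed as in the following Python sketch. It uses a simple mutual-reachability test rather than a linear-time SCC algorithm, which keeps the code short; the function names and the example chain are assumptions for illustration.

```python
def reachable(P, s):
    """States reachable from s (including s) through positive transitions."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for v, pr in P[u].items():
            if pr > 0 and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def nontrivial_sccs(P):
    """SCC*: SCCs that are neither a single absorbing state nor a single
    state outside every cycle."""
    reach = {s: reachable(P, s) for s in P}
    sccs = {frozenset(t for t in reach[s] if s in reach[t]) for s in P}
    return [K for K in sccs
            if len(K) > 1 or any(0 < P[s].get(s, 0) < 1 for s in K)]


def inp(P, K):
    """Inp_K: states of K with an incoming transition from outside K."""
    return {t for t in K if any(P[s].get(t, 0) > 0 for s in P if s not in K)}


def out(P, K):
    """Out_K: states outside K with an incoming transition from K."""
    return {s for t in K for s, pr in P[t].items() if pr > 0 and s not in K}


# Hypothetical MC with one non trivial SCC {a, b}.
P = {
    "s0": {"a": 1.0},
    "a": {"b": 0.5, "x": 0.5},
    "b": {"a": 1.0},
    "x": {"x": 1.0},
}
K = nontrivial_sccs(P)[0]
```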

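Now we are able to define an acyclic MC $\mathrm{Ac}(\mathcal{D})$ related to
$\mathcal{D}$.

Definition 5.5.2. Let $\mathcal{D} = (S, s_0, \mathcal{P}, L)$ be an MC. We define
$\mathrm{Ac}(\mathcal{D}) = (S', s_0, \mathcal{P}', L')$ where

$$S' = \underbrace{S \setminus \bigcup_{K \in \mathrm{SCC}^*} K}_{S_{com}} \;\cup\; \underbrace{\bigcup_{K \in \mathrm{SCC}^*} \mathrm{Inp}_K}_{S_{inp}},$$

$$L' = L|_{S'},$$

$$\mathcal{P}'(s,t) = \begin{cases}
\mathcal{P}(s,t) & \text{if } s \in S_{com}, \\
\mathbb{P}_{\mathcal{D},s}(\mathrm{Reach}(\mathrm{SCC}^+_s, s, \{t\})) & \text{if } s \in S_{inp} \land t \in \mathrm{Out}_{\mathrm{SCC}^+_s}, \\
1_s(t) & \text{if } s \in S_{inp} \land \mathrm{Out}_{\mathrm{SCC}^+_s} = \emptyset, \\
0 & \text{otherwise.}
\end{cases}$$

Note that $\mathrm{Ac}(\mathcal{D})$ is indeed acyclic.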
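The construction of the acyclic chain can be sketched in Python as follows. The exit probabilities of each SCC are approximated here by fixed-point iteration, which is a choice made for brevity (solving the corresponding linear system would give exact values). The list of non trivial SCCs is passed in as an argument, the initial state is assumed not to lie inside a non trivial SCC without being an input state, and all names are hypothetical.

```python
def exit_probabilities(P, K, start, iters=10_000, tol=1e-12):
    """Approximate, for each output state t of SCC K, the probability of
    leaving K from `start` through t."""
    outs = {t for u in K for t, pr in P[u].items() if pr > 0 and t not in K}
    x = {u: {t: 0.0 for t in outs} for u in K}
    for _ in range(iters):
        new = {u: {t: sum(pr * (x[v][t] if v in K else float(v == t))
                          for v, pr in P[u].items())
                   for t in outs} for u in K}
        change = max((abs(new[u][t] - x[u][t]) for u in K for t in outs),
                     default=0.0)
        x = new
        if change < tol:
            break
    return x[start]


def acyclic(P, sccs):
    """Rows of Ac(D): states outside SCC* keep their row, input states get
    one abstract transition per output state, other SCC states are dropped."""
    in_scc = {s: K for K in sccs for s in K}
    inputs = {t for K in sccs for t in K
              if any(P[s].get(t, 0) > 0 for s in P if s not in K)}
    Q = {}
    for s in P:
        if s not in in_scc:
            Q[s] = dict(P[s])                # s in S_com
        elif s in inputs:
            row = exit_probabilities(P, in_scc[s], s)
            Q[s] = row if row else {s: 1.0}  # no output states: absorbing
    return Q


# Hypothetical MC: the SCC {a, b} is left through x or y.
P = {
    "s0": {"a": 1.0},
    "a": {"b": 0.5, "x": 0.5},
    "b": {"a": 0.5, "y": 0.5},
    "x": {"x": 1.0},
    "y": {"y": 1.0},
}
Q = acyclic(P, [frozenset({"a", "b"})])
```

In this example the input state `a` leaves the SCC through `x` with probability 2/3 and through `y` with probability 1/3, and the state `b` disappears from the reduced chain.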

Example 5.5.1. Consider the MC $\mathcal{D}$ of Figure 5.5(a). The strongly
connected components of $\mathcal{D}$ are $K_1 = \{s_1, s_3, s_4, s_7\}$,
$K_2 = \{s_5, s_6, s_8\}$ and the singletons $\{s_0\}$, $\{s_2\}$, $\{s_9\}$,
$\{s_{10}\}$, $\{s_{11}\}$, $\{s_{12}\}$, $\{s_{13}\}$, and $\{s_{14}\}$. The input
states of $K_1$ are $\mathrm{Inp}_{K_1} = \{s_1\}$ and its output states are
$\mathrm{Out}_{K_1} = \{s_9, s_{10}\}$. For $K_2$,
$\mathrm{Inp}_{K_2} = \{s_5, s_6\}$ and $\mathrm{Out}_{K_2} = \{s_{11}, s_{14}\}$.
The reduced acyclic MC of $\mathcal{D}$ is shown in Figure 5.5(b).

(a) Original MC (b) Derived Acyclic MC

Figure 5.5:

Rails and Torrents 

We now relate (finite) paths in $\mathrm{Ac}(\mathcal{D})$ (rails) to sets of
paths in $\mathcal{D}$ (torrents).

Definition 5.5.3 (Rails). Let $\mathcal{D}$ be an MC. A finite path
$\sigma \in \mathrm{Paths}^*(\mathrm{Ac}(\mathcal{D}))$ will be called a rail of
$\mathcal{D}$.

Consider a rail $\sigma$, i.e., a finite path of $\mathrm{Ac}(\mathcal{D})$. We
will use $\sigma$ to represent those paths $\omega$ of $\mathcal{D}$ that behave
"similar to" $\sigma$ outside SCCs of $\mathcal{D}$. Naively, this means that
$\sigma$ is a subsequence of $\omega$. There are two technical subtleties to deal
with: every input state in $\sigma$ must be the first state in its SCC in
$\omega$ (freshness) and every SCC visited by $\omega$ must be also visited by
$\sigma$ (inertia) (see Definition 5.5.5). We need these extra conditions to
make sure that no path $\omega$ behaves "similar to" two distinct rails (see
Lemma 5.5.7).

Recall that given a finite sequence $\sigma$ and a (possibly infinite) sequence
$\omega$, we say that $\sigma$ is a subsequence of $\omega$, denoted by
$\sigma \sqsubseteq \omega$, if and only if there exists a strictly increasing
function $f : \{0, 1, \ldots, |\sigma| - 1\} \to \{0, 1, \ldots, |\omega| - 1\}$
such that $\forall_{0 \le i < |\sigma|} \,.\, \sigma_i = \omega_{f(i)}$. If $\omega$
is an infinite sequence, we interpret the codomain of $f$ as $\mathbb{N}$. In
case $f$ is such a function we write $\sigma \sqsubseteq_f \omega$.

Definition 5.5.4. Let $\mathcal{D} = (S, s_0, \mathcal{P}, L)$ be an MC. On $S$ we
consider the equivalence relation $\sim_{\mathcal{D}}$ satisfying
$s \sim_{\mathcal{D}} t$ if and only if $s$ and $t$ are in the same strongly
connected component. Again, we usually omit the subscript $\mathcal{D}$ from the
notation.

The following definition refines the notion of subsequence, taking care
of the two technical subtleties noted above.

Definition 5.5.5. Let $\mathcal{D} = (S, s_0, \mathcal{P}, L)$ be an MC, $\omega$ a
(finite) path of $\mathcal{D}$, and
$\sigma \in \mathrm{Paths}^*(\mathrm{Ac}(\mathcal{D}))$ a finite path of
$\mathrm{Ac}(\mathcal{D})$. Then we write $\sigma \preceq \omega$ if there exists
$f : \{0, 1, \ldots, |\sigma| - 1\} \to \mathbb{N}$ such that
$\sigma \sqsubseteq_f \omega$ and

$$\forall_{0 \le j < f(i)} : \omega_{f(i)} \not\sim \omega_j, \text{ for all } i = 0, 1, \ldots, |\sigma| - 1, \quad \{\text{Freshness property}\}$$
$$\forall_{f(i) < j < f(i+1)} : \omega_{f(i)} \sim \omega_j, \text{ for all } i = 0, 1, \ldots, |\sigma| - 2. \quad \{\text{Inertia property}\}$$

In case $f$ is such a function we write $\sigma \preceq_f \omega$.

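Example 5.5.2. Let $\mathcal{D} = (S, s_0, \mathcal{P}, L)$ be the MC of Figure
5.5(a) and take $\sigma = s_0 s_2 s_6 s_{14}$. Then for all $i \in \mathbb{N}$ we
have $\sigma \preceq_{f_i} \omega_i$ where
$\omega_i = s_0 s_2 s_6 (s_5 s_8 s_6)^i s_{14}$ and $f_i(0) = 0$, $f_i(1) = 1$,
$f_i(2) = 2$, and $f_i(3) = 3 + 3i$. Additionally,
$\sigma \not\preceq s_0 s_2 s_5 s_8 s_6 s_{14}$ since for all $f$ satisfying
$\sigma \sqsubseteq_f s_0 s_2 s_5 s_8 s_6 s_{14}$ we must have $f(2) = 4$; this
implies that $f$ does not satisfy the freshness property. Finally, note that
$\sigma \not\preceq s_0 s_2 s_6 s_{11} s_{14}$ since for all $f$ satisfying
$\sigma \sqsubseteq_f s_0 s_2 s_6 s_{11} s_{14}$ we must have $f(2) = 2$; this
implies that $f$ does not satisfy the inertia property.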
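For finite paths, the freshness and inertia conditions can be checked mechanically: freshness forces each rail state to be matched at the first position of the path that lies in its SCC, so at most one embedding exists. The Python sketch below follows this observation. The SCC classes are hard-coded from Example 5.5.1, the three test paths are modeled on Example 5.5.2, and the names `similar`, `find_embedding` and `scc_of` are assumptions of this illustration.

```python
def similar(scc_of, s, t):
    """s ~ t: both states lie in the same SCC (states missing from scc_of
    form singleton components)."""
    return scc_of.get(s, s) == scc_of.get(t, t)


def find_embedding(sigma, omega, scc_of):
    """Return f (as a list of positions) with sigma related to omega under
    freshness and inertia, or None if no such f exists."""
    f = []
    for state in sigma:
        # Freshness: f(i) must be the first position whose state is in the
        # SCC of sigma_i, and that state must be sigma_i itself.
        first = next((j for j, w in enumerate(omega)
                      if similar(scc_of, w, state)), None)
        if first is None or omega[first] != state:
            return None
        if f and first <= f[-1]:
            return None  # f must be strictly increasing
        f.append(first)
    # Inertia: between two matched positions the path stays in the same SCC.
    for a, b in zip(f, f[1:]):
        if any(not similar(scc_of, omega[a], omega[j]) for j in range(a + 1, b)):
            return None
    return f


# SCC classes assumed from Example 5.5.1.
scc_of = {s: "K1" for s in ("s1", "s3", "s4", "s7")}
scc_of.update({s: "K2" for s in ("s5", "s6", "s8")})

sigma = ["s0", "s2", "s6", "s14"]
ok = find_embedding(sigma, ["s0", "s2", "s6", "s5", "s8", "s6", "s14"], scc_of)
not_fresh = find_embedding(sigma, ["s0", "s2", "s5", "s8", "s6", "s14"], scc_of)
not_inert = find_embedding(sigma, ["s0", "s2", "s6", "s11", "s14"], scc_of)
```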

We now give the formal definition of torrents. 

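Definition 5.5.6 (Torrents). Let $\mathcal{D} = (S, s_0, \mathcal{P}, L)$ be an MC
and $\sigma$ a sequence of states in $S$. We define the function
$\mathrm{Torr}$ by

$$\mathrm{Torr}(\mathcal{D}, \sigma) = \{\omega \in \mathrm{Paths}(\mathcal{D}) \mid \sigma \preceq \omega\}.$$

We call $\mathrm{Torr}(\mathcal{D}, \sigma)$ the torrent associated to $\sigma$.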

We now show that torrents are disjoint (Lemma 5.5.7) and that the
probability of a rail is equal to the probability of its associated torrent
(Theorem 5.5.10). For this last result, we first show that torrents can be
represented as the disjoint union of cones of finite paths. We call these
finite paths generators of the torrent (Definition 5.5.8).

Lemma 5.5.7. Let $\mathcal{D}$ be an MC. For every
$\sigma, \rho \in \mathrm{Paths}^*(\mathrm{Ac}(\mathcal{D}))$ we have

$$\sigma \ne \rho \;\Rightarrow\; \mathrm{Torr}(\mathcal{D}, \sigma) \cap \mathrm{Torr}(\mathcal{D}, \rho) = \emptyset.$$

Definition 5.5.8 (Torrent Generators). Let $\mathcal{D}$ be an MC. Then we define
for every rail $\sigma \in \mathrm{Paths}^*(\mathrm{Ac}(\mathcal{D}))$ the set

$$\mathrm{TorrGen}(\mathcal{D}, \sigma) = \{\rho \in \mathrm{Paths}^*(\mathcal{D}) \mid \exists f : \sigma \preceq_f \rho \land f(|\sigma| - 1) = |\rho| - 1\}.$$

In the example from the Introduction (see Figure 5.1), $s_0 s_1 s_3$ and
$s_0 s_2 s_4$ are rails. Their associated torrents are, respectively,
$\{s_0 s_1^n s_3^\omega \mid n \in \mathbb{N}^*\}$ and
$\{s_0 s_2^n s_4^\omega \mid n \in \mathbb{N}^*\}$ (note that $s_3$ and $s_4$ are
absorbing states), i.e., the paths going left and the paths going right. The
generators of the first torrent are $\{s_0 s_1^n s_3 \mid n \in \mathbb{N}^*\}$
and similarly for the second torrent.

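Lemma 5.5.9. Let $\mathcal{D}$ be an MC and
$\sigma \in \mathrm{Paths}^*(\mathrm{Ac}(\mathcal{D}))$ a rail of $\mathcal{D}$.
Then we have

$$\mathrm{Torr}(\mathcal{D}, \sigma) = \biguplus_{\rho \in \mathrm{TorrGen}(\mathcal{D}, \sigma)} \langle \rho \rangle.$$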

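Proof. The proof is by cases on the length of the rail. We prove the result for
the cases in which the rail is of the form $\sigma t s$, with $t$ an input state
and $s$ an output state; the other cases are simpler. In order to prove this
lemma, we define (for each $\sigma t s$ of the above form) the following set of
finite paths

$$\Delta_{\sigma t s} = \{\rho\,\mathrm{tail}(\pi) \mid \rho \in \mathrm{TorrGen}(\sigma t) \text{ and } \pi \in \mathrm{Paths}^*(\mathrm{SCC}^+_t, t, \{s\})\} \qquad (5.1)$$

Checking that
$\mathrm{Torr}(\mathcal{D}, \sigma t s) = \biguplus_{\rho \in \Delta_{\sigma t s}} \langle \rho \rangle$
is straightforward. We now focus on proving that

$$\Delta_{\sigma t s} = \mathrm{TorrGen}(\mathcal{D}, \sigma t s). \qquad (5.2)$$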

For that purpose we need the following two observations. 

Observation 5.5.3. Let $\mathcal{D}$ be an MC. Since $\mathrm{Ac}(\mathcal{D})$ is
acyclic we have $\sigma_i \ne \sigma_j$ for every
$\sigma \in \mathrm{Paths}^*(\mathrm{Ac}(\mathcal{D}))$ and $i \ne j$ (with the
exception of absorbing states).

Observation 5.5.4. Let $\sigma$, $\omega$ and $f$ be such that
$\sigma \preceq_f \omega$. Then $\forall i : \exists j : \omega_i \sim \sigma_j$.
This follows from $\sigma \sqsubseteq_f \omega$ and the inertia property.

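We now proceed to prove that
$\Delta_{\sigma t s} = \mathrm{TorrGen}(\mathcal{D}, \sigma t s)$.

($\supseteq$) Let $\rho_0 \rho_1 \cdots \rho_k \in \mathrm{TorrGen}(\sigma t s)$
and $n_t$ the lowest subindex of $\rho$ such that $\rho_{n_t} = t$. Take
$\rho = \rho_0 \rho_1 \cdots \rho_{n_t}$ and $\pi = \rho_{n_t} \cdots \rho_k$ (note
that $\rho_0 \rho_1 \cdots \rho_k = \rho\,\mathrm{tail}(\pi)$). In order to prove
that $\rho_0 \rho_1 \cdots \rho_k \in \Delta_{\sigma t s}$ we need to prove that

(1) $\rho \in \mathrm{TorrGen}(\sigma t)$, and

(2) $\pi \in \mathrm{Paths}^*(\mathrm{SCC}^+_t, t, \{s\})$.

(1) Let $f$ be such that $\sigma t s \preceq_f \rho_0 \rho_1 \cdots \rho_k$ and
$f(|\sigma t s| - 1) = k$. Take
$g : \{0, 1, \ldots, |\sigma t| - 1\} \to \mathbb{N}$ to be the restriction of
$f$. It is easy to check that $\sigma t \preceq_g \rho$. Additionally
$f(|\sigma t| - 1) = n_t$ (otherwise $f$ would not satisfy the freshness
property for $i = |\sigma t| - 1$). Then, by definition of $g$, we have
$g(|\sigma t| - 1) = n_t$.

(2) It is clear that $\pi$ is a path from $t$ to $s$. Therefore we only have to
show that every state of $\pi$ is in $\mathrm{SCC}^+_t$. By definition of
$\mathrm{SCC}^+_t$, $\pi_0 = t \in \mathrm{SCC}^+_t$ and $s \in \mathrm{SCC}^+_t$
since $s \in \mathrm{Out}_{\mathrm{SCC}^+_t}$. Additionally, since $f$ satisfies
the inertia property we have that
$\forall_{f(|\sigma t| - 1) < j < f(|\sigma t s| - 1)} : \rho_{f(|\sigma t| - 1)} \sim \rho_j$;
since $f(|\sigma t| - 1) = n_t$ and $\pi = \rho_{n_t} \cdots \rho_k$ we have
$\forall_{0 < j < |\pi| - 1} : t \sim \pi_j$, proving that
$\pi_j \in \mathrm{SCC}^+_t$ for $j \in \{1, \ldots, |\pi| - 2\}$.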

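($\subseteq$) Take $\rho \in \mathrm{TorrGen}(\sigma t)$ and
$\pi \in \mathrm{Paths}^*(\mathrm{SCC}^+_t, t, \{s\})$. In order to prove that
$\rho\,\mathrm{tail}(\pi) \in \mathrm{TorrGen}(\sigma t s)$ we need to show that
there exists a function $g$ such that:

(1) $\sigma t s \preceq_g \rho\,\mathrm{tail}(\pi)$,

(2) $g(|\sigma t s| - 1) = |\rho\,\mathrm{tail}(\pi)| - 1$.

Since $\rho \in \mathrm{TorrGen}(\sigma t)$ we know that there exists $f$ such
that $\sigma t \preceq_f \rho$ and $f(|\sigma t| - 1) = |\rho| - 1$. We define
$g : \{0, 1, \ldots, |\sigma t s| - 1\} \to \{0, 1, \ldots, |\rho\,\mathrm{tail}(\pi)| - 1\}$
by

$$g(i) = \begin{cases}
f(i) & \text{if } i < |\sigma t s| - 1, \\
|\rho\,\mathrm{tail}(\pi)| - 1 & \text{if } i = |\sigma t s| - 1.
\end{cases}$$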

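(1) It is easy to check that $\sigma t s \sqsubseteq_g \rho\,\mathrm{tail}(\pi)$.
Now we will show that $g$ satisfies the freshness and inertia properties.

Freshness property: We need to show that for all $0 \le i < |\sigma t s|$ we have
$\forall_{0 \le j < g(i)} : \rho\,\mathrm{tail}(\pi)_{g(i)} \not\sim \rho\,\mathrm{tail}(\pi)_j$.
For the cases $i \in \{0, \ldots, |\sigma t| - 1\}$ this holds since
$\sigma t \preceq_f \rho$ and by the definition of $g$.

Consider $i = |\sigma t s| - 1$; in this case we have to prove
$\forall_{0 \le j < |\rho\,\mathrm{tail}(\pi)| - 1} : \rho\,\mathrm{tail}(\pi)_{|\rho\,\mathrm{tail}(\pi)| - 1} \not\sim \rho\,\mathrm{tail}(\pi)_j$
or equivalently
$\forall_{0 \le j < |\rho\,\mathrm{tail}(\pi)| - 1} : s \not\sim \rho\,\mathrm{tail}(\pi)_j$.

Case $j \in \{|\rho|, \ldots, |\rho\,\mathrm{tail}(\pi)| - 1\}$.

From $\pi \in \mathrm{Paths}^*(\mathrm{SCC}^+_t, t, \{s\})$ and
$s \in \mathrm{Out}_{\mathrm{SCC}^+_t}$ it is easy to see
$\forall_{0 \le j < |\mathrm{tail}(\pi)| - 1} : s \not\sim \mathrm{tail}(\pi)_j$.

Case $j \in \{0, \ldots, |\rho| - 1\}$.

From $\sigma t s \in \mathrm{Paths}^*(\mathrm{Ac}(\mathcal{D}))$ and Observation
5.5.3 we have that $\forall_{0 \le j \le |\sigma t| - 1} : s \ne \sigma t_j$.
Additionally, $\sigma t \preceq_f \rho$, the definition of $g$ and Observation
5.5.4 imply $\forall_{0 \le j < |\rho|} : s \not\sim \rho_j$ or equivalently
$\forall_{0 \le j < |\rho|} : s \not\sim \rho\,\mathrm{tail}(\pi)_j$.

Inertia property: Since $\pi \in \mathrm{Paths}^*(\mathrm{SCC}^+_t, t, \{s\})$ we
have $\forall_{0 \le j < |\pi| - 1} : t \sim \pi_j$, which implies that
$\forall_{|\rho| - 1 < j < |\rho\,\mathrm{tail}(\pi)| - 1} : \rho\,\mathrm{tail}(\pi)_{|\rho| - 1} \sim \rho\,\mathrm{tail}(\pi)_j$
or equivalently
$\forall_{g(|\sigma t| - 1) < j < g(|\sigma t s| - 1)} : \rho\,\mathrm{tail}(\pi)_{g(|\sigma t| - 1)} \sim \rho\,\mathrm{tail}(\pi)_j$,
showing that $g$ satisfies the inertia property.

(2) Follows from the definition of $g$. □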

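Theorem 5.5.10. Let $\mathcal{D}$ be an MC. Then for every rail
$\sigma \in \mathrm{Paths}^*(\mathrm{Ac}(\mathcal{D}))$ we have

$$\mathbb{P}_{\mathrm{Ac}(\mathcal{D})}(\langle \sigma \rangle) = \mathbb{P}_{\mathcal{D}}(\mathrm{Torr}(\mathcal{D}, \sigma)).$$

Proof. By induction on the structure of $\sigma$.

Base Case: Note that
$\mathbb{P}_{\mathrm{Ac}(\mathcal{D})}(\langle s_0 \rangle) = \mathbb{P}_{\mathrm{Ac}(\mathcal{D})}(\mathrm{Paths}(\mathrm{Ac}(\mathcal{D}), s_0)) = 1$,
and similarly
$1 = \mathbb{P}_{\mathcal{D}}(\mathrm{Paths}(\mathcal{D}, s_0)) = \mathbb{P}_{\mathcal{D}}(\mathrm{Torr}(s_0))$.

Inductive Step: Let $t$ be such that $\mathrm{last}(\sigma) = t$. Suppose that
$t \in S_{com}$ and denote by $\mathrm{Ac}(\mathcal{P})$ the probability matrix of
$\mathrm{Ac}(\mathcal{D})$. Then

$$\begin{aligned}
\mathbb{P}_{\mathrm{Ac}(\mathcal{D})}(\langle \sigma s \rangle)
&= \mathbb{P}_{\mathrm{Ac}(\mathcal{D})}(\langle \sigma \rangle) \cdot \mathrm{Ac}(\mathcal{P})(t,s) \\
&= \mathbb{P}_{\mathcal{D}}(\mathrm{Torr}(\sigma)) \cdot \mathcal{P}(t,s) && \{\text{Inductive Hypothesis and definition of } \mathrm{Ac}(\mathcal{P})\} \\
&= \mathbb{P}_{\mathcal{D}}\Big(\biguplus_{\rho \in \mathrm{TorrGen}(\sigma)} \langle \rho \rangle\Big) \cdot \mathcal{P}(t,s) && \{\text{Lem. 5.5.9}\} \\
&= \sum_{\rho \in \mathrm{TorrGen}(\sigma)} \mathbb{P}_{\mathcal{D}}(\langle \rho \rangle) \cdot \mathbb{P}_{\mathcal{D},t}(\langle t s \rangle) \\
&= \sum_{\rho \in \mathrm{TorrGen}(\sigma)} \mathbb{P}_{\mathcal{D}}(\langle \rho\,\mathrm{tail}(t s) \rangle) \\
&= \sum_{\rho \in \mathrm{TorrGen}(\sigma s)} \mathbb{P}_{\mathcal{D}}(\langle \rho \rangle) \\
&= \mathbb{P}_{\mathcal{D}}\Big(\biguplus_{\rho \in \mathrm{TorrGen}(\sigma s)} \langle \rho \rangle\Big) \\
&= \mathbb{P}_{\mathcal{D}}(\mathrm{Torr}(\sigma s)) && \{\text{Lem. 5.5.9}\}
\end{aligned}$$

Now suppose that $t \in S_{inp}$, then

$$\begin{aligned}
\mathbb{P}_{\mathrm{Ac}(\mathcal{D})}(\langle \sigma s \rangle)
&= \mathbb{P}_{\mathrm{Ac}(\mathcal{D})}(\langle \sigma \rangle) \cdot \mathrm{Ac}(\mathcal{P})(t,s) \\
&= \mathbb{P}_{\mathcal{D}}(\mathrm{Torr}(\sigma)) \cdot \mathrm{Ac}(\mathcal{P})(t,s) && \{\text{HI}\} \\
&= \mathbb{P}_{\mathcal{D}}\Big(\biguplus_{\rho \in \mathrm{TorrGen}(\sigma)} \langle \rho \rangle\Big) \cdot \mathrm{Ac}(\mathcal{P})(t,s) && \{\text{Lem. 5.5.9}\} \\
&= \Big(\sum_{\rho \in \mathrm{TorrGen}(\sigma)} \mathbb{P}_{\mathcal{D}}(\langle \rho \rangle)\Big) \cdot \mathrm{Ac}(\mathcal{P})(t,s) \\
&= \sum_{\rho \in \mathrm{TorrGen}(\sigma)} \mathbb{P}_{\mathcal{D}}(\langle \rho \rangle) \cdot \mathbb{P}_{\mathcal{D},t}(\mathrm{Paths}(\mathrm{SCC}^+_t, t, \{s\})) && \{\text{By definition of } \mathrm{Ac}(\mathcal{P}) \text{ and distributivity}\} \\
&= \sum_{\rho \in \mathrm{TorrGen}(\sigma)} \mathbb{P}_{\mathcal{D}}(\langle \rho \rangle) \cdot \sum_{\pi \in \mathrm{Paths}^*(\mathrm{SCC}^+_t, t, \{s\})} \mathbb{P}_{\mathcal{D},t}(\langle \pi \rangle) \\
&= \sum_{\rho \in \mathrm{TorrGen}(\sigma),\; \pi \in \mathrm{Paths}^*(\mathrm{SCC}^+_t, t, \{s\})} \mathbb{P}_{\mathcal{D}}(\langle \rho\,\mathrm{tail}(\pi) \rangle) && \{\text{Dfn. } \mathbb{P}\} \\
&= \sum_{\rho \in \Delta_{\sigma s}} \mathbb{P}_{\mathcal{D}}(\langle \rho \rangle) && \{(5.1)\} \\
&= \sum_{\rho \in \mathrm{TorrGen}(\sigma s)} \mathbb{P}_{\mathcal{D}}(\langle \rho \rangle) && \{(5.2)\} \\
&= \mathbb{P}_{\mathcal{D}}\Big(\biguplus_{\rho \in \mathrm{TorrGen}(\sigma s)} \langle \rho \rangle\Big) \\
&= \mathbb{P}_{\mathcal{D}}(\mathrm{Torr}(\sigma s)) \qquad \Box
\end{aligned}$$

5.6 Significant Diagnostic Counterexamples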

So far we have formalized the notion of paths behaving similarly (i.e.,
behaving the same outside SCCs) in an MC $\mathcal{D}$ by removing all SCCs of
$\mathcal{D}$, obtaining $\mathrm{Ac}(\mathcal{D})$. A representative
counterexample to $\mathrm{Ac}(\mathcal{D}) \models_{\le p} \Diamond\psi$ gives rise
to a representative counterexample to $\mathcal{D} \models_{\le p} \Diamond\psi$ in
the following way: for every finite path $\sigma$ in the representative
counterexample to $\mathrm{Ac}(\mathcal{D}) \models_{\le p} \Diamond\psi$ the set
$\mathrm{TorrGen}(\mathcal{D}, \sigma)$ is a witness; we then obtain the desired
representative counterexample to $\mathcal{D} \models_{\le p} \Diamond\psi$ by
taking the union of these witnesses.

Before giving a formal definition, there is still one technical issue to
resolve: we need to be sure that by removing SCCs we are not discarding
useful information. Because torrents are built from rails, we need to make
sure that when we discard SCCs, we do not discard rails that reach $\psi$.

We achieve this by first making states satisfying $\psi$ absorbing.
Additionally, we make absorbing those states from which it is not possible to
reach $\psi$. Note that this does not affect counterexamples.
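The two absorbing steps just described can be sketched in Python as follows. The set of states that can reach $\psi$ is computed by backward reachability; the function name `make_absorbing`, the matrix encoding and the example chain are assumptions of this illustration.

```python
def make_absorbing(P, sat):
    """Return the matrix in which states in `sat`, and states that cannot
    reach `sat`, are made absorbing; all other rows are kept unchanged."""
    # Backward reachability: states with positive probability of reaching sat.
    can_reach = set(sat)
    changed = True
    while changed:
        changed = False
        for s in P:
            if s not in can_reach and any(
                    pr > 0 and t in can_reach for t, pr in P[s].items()):
                can_reach.add(s)
                changed = True
    return {s: dict(P[s]) if (s in can_reach and s not in sat) else {s: 1.0}
            for s in P}


# Hypothetical MC: g satisfies psi; s2 and s3 form a cycle that never reaches g.
P = {
    "s0": {"s1": 0.5, "s2": 0.5},
    "s1": {"s0": 0.3, "g": 0.7},
    "s2": {"s3": 1.0},
    "s3": {"s2": 1.0},
    "g": {"s0": 1.0},
}
P_psi = make_absorbing(P, {"g"})
```

After this step the cycle between `s2` and `s3` has become two absorbing states, so it no longer forms a non trivial SCC and is not abstracted when the acyclic chain is built.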



Definition 5.6.1. Let $\mathcal{D} = (S, s_0, \mathcal{P}, L)$ be an MC and $\psi$
a propositional formula. We define the MC
$\mathcal{D}_\psi = (S, s_0, \mathcal{P}_\psi, L)$, with

$$\mathcal{P}_\psi(s,t) = \begin{cases}
1 & \text{if } s \notin \mathrm{Sat}_\Diamond(\psi) \land s = t, \\
1 & \text{if } s \in \mathrm{Sat}(\psi) \land s = t, \\
\mathcal{P}(s,t) & \text{if } s \in \mathrm{Sat}_\Diamond(\psi) - \mathrm{Sat}(\psi), \\
0 & \text{otherwise,}
\end{cases}$$

where
$\mathrm{Sat}_\Diamond(\psi) = \{s \in S \mid \mathbb{P}_{\mathcal{D},s}(\mathrm{Reach}(\mathcal{D}, s, \mathrm{Sat}(\psi))) > 0\}$
is the set of states reaching $\psi$ in $\mathcal{D}$.

The following theorem shows the relation between paths, finite paths,
and probabilities of $\mathcal{D}$, $\mathcal{D}_\psi$, and
$\mathrm{Ac}(\mathcal{D}_\psi)$. Most importantly, the probability of a rail
$\sigma$ (in $\mathrm{Ac}(\mathcal{D}_\psi)$) is equal to the probability of its
associated torrent (in $\mathcal{D}$) (item 5 below) and the probability of
$\Diamond\psi$ is not affected by reducing $\mathcal{D}$ to
$\mathrm{Ac}(\mathcal{D}_\psi)$ (item 6 below).

Note that a rail $\sigma$ is always a finite path in
$\mathrm{Ac}(\mathcal{D}_\psi)$, but that we can talk about its associated torrent
$\mathrm{Torr}(\mathcal{D}_\psi, \sigma)$ in $\mathcal{D}_\psi$ and about its
associated torrent $\mathrm{Torr}(\mathcal{D}, \sigma)$ in $\mathcal{D}$. The
former exists for technical convenience; it is the latter that we are
ultimately interested in. The following theorem also shows that for our
purposes, viz. the definition of the generators of the torrent and the
probability of the torrent, there is no difference (items 3 and 4 below).

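Corollary 5.6.1. Let $\mathcal{D} = (S, s_0, \mathcal{P}, L)$ be an MC and $\psi$ a
propositional formula. Then for every
$\sigma \in \mathrm{Paths}^*(\mathcal{D}_\psi)$

1. $\mathrm{Reach}^*(\mathcal{D}_\psi, s_0, \mathrm{Sat}(\psi)) = \mathrm{Reach}^*(\mathcal{D}, s_0, \mathrm{Sat}(\psi))$,

3. $\mathrm{TorrGen}(\mathcal{D}_\psi, \sigma) = \mathrm{TorrGen}(\mathcal{D}, \sigma)$,

4. $\mathbb{P}_{\mathcal{D}_\psi}(\mathrm{Torr}(\mathcal{D}_\psi, \sigma)) = \mathbb{P}_{\mathcal{D}}(\mathrm{Torr}(\mathcal{D}, \sigma))$,

6. $\mathrm{Ac}(\mathcal{D}_\psi) \models_{\le p} \Diamond\psi$ if and only if $\mathcal{D} \models_{\le p} \Diamond\psi$, for any $p \in [0, 1]$.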

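Definition 5.6.2 (Torrent-Counterexamples). Let
$\mathcal{D} = (S, s_0, \mathcal{P}, L)$ be an MC, $\psi$ a propositional formula,
and $p \in [0, 1]$ such that $\mathcal{D} \not\models_{\le p} \Diamond\psi$. Let $C$
be a representative counterexample to
$\mathrm{Ac}(\mathcal{D}_\psi) \models_{\le p} \Diamond\psi$. We define the set

$$\mathrm{TorRepCount}(C) = \{\mathrm{TorrGen}(\mathcal{D}, \sigma) \mid \sigma \in C\}.$$

We call the set $\mathrm{TorRepCount}(C)$ a torrent-counterexample of $C$. Note
that this set is a partition of a representative counterexample to
$\mathcal{D} \models_{\le p} \Diamond\psi$. Additionally, we denote by
$\mathcal{R}_t(\mathcal{D}, p, \psi)$ the set of all torrent-counterexamples to
$\mathcal{D} \models_{\le p} \Diamond\psi$, i.e.,
$\{\mathrm{TorRepCount}(C) \mid C \in \mathcal{R}(\mathrm{Ac}(\mathcal{D}_\psi), p, \psi)\}$.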

Theorem 5.6.3. Let $\mathcal{D} = (S, s_0, \mathcal{P}, L)$ be an MC, $\psi$ a
propositional formula, and $p \in [0, 1]$ such that
$\mathcal{D} \not\models_{\le p} \Diamond\psi$. Take $C$ a representative
counterexample to $\mathrm{Ac}(\mathcal{D}_\psi) \models_{\le p} \Diamond\psi$. Then
the set of finite paths $\biguplus_{W \in \mathrm{TorRepCount}(C)} W$ is a
representative counterexample to $\mathcal{D} \models_{\le p} \Diamond\psi$.

Note that for each $\sigma \in C$ we get a witness
$\mathrm{TorrGen}(\mathcal{D}, \sigma)$. Also note that the number of rails is
finite, so there are also only finitely many witnesses.

Following [HK07a], we extend the notions of minimum counterexamples 
and strongest evidence. 

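Definition 5.6.4 (Minimum torrent-counterexample). Let $\mathcal{D}$ be an MC,
$\psi$ a propositional formula and $p \in [0, 1]$. We say that
$C_t \in \mathcal{R}_t(\mathcal{D}, p, \psi)$ is a minimum torrent-counterexample
if $|C_t| \le |C_t'|$, for all $C_t' \in \mathcal{R}_t(\mathcal{D}, p, \psi)$.

Definition 5.6.5 (Strongest torrent-evidence). Let $\mathcal{D}$ be an MC, $\psi$
a propositional formula and $p \in [0, 1]$. A strongest torrent-evidence to
$\mathcal{D} \not\models_{\le p} \Diamond\psi$ is a torrent
$\mathrm{Torr}(\mathcal{D}, \sigma)$ such that
$\sigma \in \mathrm{Paths}^*(\mathrm{Ac}(\mathcal{D}_\psi))$ and
$\mathbb{P}_{\mathcal{D}}(\mathrm{Torr}(\mathcal{D}, \sigma)) \ge \mathbb{P}_{\mathcal{D}}(\mathrm{Torr}(\mathcal{D}, \rho))$
for all $\rho \in \mathrm{Paths}^*(\mathrm{Ac}(\mathcal{D}_\psi))$.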

Now we define our notion of significant diagnostic counterexamples. It 
is the generalization of most indicative counterexample from [HK07a] to 
our setting. 

Definition 5.6.6 (Most indicative torrent-counterexample). Let $\mathcal{D}$ be an
MC, $\psi$ a propositional formula and $p \in [0, 1]$. We say that $C_t \in \mathcal{R}_t(\mathcal{D}, p, \psi)$
is a most indicative torrent-counterexample if it is a minimum torrent-counterexample
and $\mathbb{P}(\bigcup_{T \in C_t} \langle T \rangle) \geq \mathbb{P}(\bigcup_{T \in C_t'} \langle T \rangle)$ for all minimum
torrent-counterexamples $C_t' \in \mathcal{R}_t(\mathcal{D}, p, \psi)$.

Note that in our setting, as in [HK07a], a minimal torrent-counterexample 
C consists of the \C\ strongest torrent-evidences. 

By Theorem 5.6.3 it is possible to obtain strongest torrent-evidence and
most indicative torrent-counterexamples of an MC $\mathcal{D}$ by obtaining strongest
evidence and most indicative counterexamples of $\mathrm{Ac}(\mathcal{D}_\psi)$ respectively.

5.7 Computing Counterexamples 

In this section we show how to compute most indicative torrent-counterexamples. 
We also discuss what information to present to the user: how to present 
witnesses and how to deal with overly large strongly connected components. 

5.7.1 Maximizing Schedulers

The calculation of the maximal probability on a reachability problem can
be performed by solving a linear minimization problem [BdA95, dA97].
This minimization problem is defined on a system of inequalities that has a
variable $x_i$ for each different state $s_i$ and an inequality $\sum_j \pi(s_j) \cdot x_j \leq x_i$ for
each distribution $\pi \in \tau(s_i)$. The maximizing (deterministic memoryless)
scheduler $\eta$ can be easily extracted out of such a system of inequalities after
obtaining the solution. If $p_0, \ldots, p_n$ are the values that minimize $\sum_i x_i$ in
the previous system, then $\eta$ is such that, for all $s_i$, $\eta(s_i) = \pi$ whenever
$\sum_j \pi(s_j) \cdot p_j = p_i$. In the following we denote $\mathbb{P}_{s_i}[\Diamond\psi] \triangleq p_i$.
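
The extraction of a maximizing scheduler can be illustrated with a short Python sketch. Note that it swaps the linear program for plain value iteration, which converges (in the limit) to the same least solution; this is a substitution made for self-containedness, not the method of [BdA95, dA97]. The dictionary encoding of the MDP, the function names, and the example model are all assumptions introduced here. Ties between distributions are broken by index, which can be unsafe in general, so the example avoids ties.

```python
def max_reachability(mdp, targets, eps=1e-12):
    """Approximate maximal reachability probabilities by value iteration.

    mdp maps each state to a list of distributions, each distribution being
    a dict {successor: probability}. This encoding is hypothetical.
    """
    x = {s: (1.0 if s in targets else 0.0) for s in mdp}
    while True:
        delta = 0.0
        for s in mdp:
            if s in targets:
                continue
            # Best expected value over all distributions enabled in s.
            best = max(sum(pr * x[t] for t, pr in dist.items()) for dist in mdp[s])
            delta = max(delta, abs(best - x[s]))
            x[s] = best
        if delta < eps:
            return x


def maximizing_scheduler(mdp, targets, x):
    """Pick, in every non-target state, the index of a distribution attaining x."""
    def value(s, i):
        return sum(pr * x[t] for t, pr in mdp[s][i].items())
    return {s: max(range(len(mdp[s])), key=lambda i: value(s, i))
            for s in mdp if s not in targets}


# Hypothetical example: s1 is the target, s2 a sink.
example_mdp = {
    "s0": [{"s1": 0.5, "s2": 0.5}, {"s1": 0.9, "s3": 0.1}],
    "s1": [{"s1": 1.0}],
    "s2": [{"s2": 1.0}],
    "s3": [{"s1": 0.5, "s2": 0.5}],
}
example_targets = {"s1"}
values = max_reachability(example_mdp, example_targets)
eta = maximizing_scheduler(example_mdp, example_targets, values)
```

In this example the second distribution of s0 is the better choice, since it reaches s1 directly with probability 0.9 and gains a further 0.1 · 0.5 through s3.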

5.7.2 Computing most indicative torrent-counterexamples 

We divide the computation of most indicative torrent-counterexamples to
$\mathcal{M} \models_{\leq p} \Diamond\psi$ in three stages: pre-processing, SCC analysis, and searching.

Pre-processing stage. We first modify the original MC $\mathcal{D}$ by making
all states in $\mathrm{Sat}(\psi) \cup (S \setminus \mathrm{Sat}_\Diamond(\psi))$ absorbing. In this way we obtain the MC
$\mathcal{D}_\psi$ from Definition 5.6.1. Note that we do not have to spend additional
computational resources to compute this set, since $\mathrm{Sat}_\Diamond(\psi) = \{s \in S \mid
\mathbb{P}_s[\Diamond\psi] > 0\}$ and hence all required data is already available from the LTL
model checking phase.
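
The pre-processing step can be sketched in a few lines of Python. The sketch is only illustrative: the Markov chain is encoded as a dictionary of successor distributions (an assumption of this example), and the set of states that can reach the target is recomputed by a backward fixpoint rather than reused from the model checking phase as described above. All names are hypothetical.

```python
def sat_diamond(chain, sat):
    """States that reach a state in `sat` with positive probability."""
    reach = set(sat)
    changed = True
    while changed:
        changed = False
        for s, succ in chain.items():
            if s not in reach and any(pr > 0 and t in reach for t, pr in succ.items()):
                reach.add(s)
                changed = True
    return reach


def make_absorbing(chain, sat):
    """Make every state in Sat(psi) or outside Sat_diamond(psi) absorbing."""
    keep = sat_diamond(chain, sat) - set(sat)  # states whose transitions survive
    return {s: (dict(succ) if s in keep else {s: 1.0}) for s, succ in chain.items()}


# Hypothetical example: g satisfies psi; s2 and s3 can never reach g.
example_chain = {
    "s0": {"s1": 0.5, "s2": 0.5},
    "s1": {"g": 1.0},
    "s2": {"s2": 0.5, "s3": 0.5},
    "s3": {"s2": 1.0},
    "g": {"s0": 1.0},
}
absorbed = make_absorbing(example_chain, {"g"})
```

Note how the SCC formed by s2 and s3 disappears: both states become absorbing, which is the size reduction mentioned later in Section 5.7.3.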

SCC analysis stage. We remove all SCCs $K$ of $\mathcal{D}_\psi$, keeping just the input
states of $K$, getting the acyclic MC $\mathrm{Ac}(\mathcal{D}_\psi)$ according to Definition 5.5.2.

To compute this, we first need to find the SCCs of $\mathcal{D}_\psi$. There exist several
well-known algorithms to achieve this: Kosaraju's, Tarjan's, and Gabow's
algorithms (among others). We also have to compute the reachability probability
from input states to output states of every SCC. This can be done
by using steady-state analysis techniques [Cas93].
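
As an illustration of the first half of this stage, the following Python sketch finds SCCs with a recursive version of Tarjan's algorithm and then reads off input and output states. It assumes that an input state of $K$ is a state of $K$ with an incoming transition from outside $K$, and that an output state is a state outside $K$ with an incoming transition from $K$; the precise definitions are those of Section 5.5, which this sketch only approximates. The graph encoding and all names are hypothetical, and the reachability probabilities between input and output states are not computed here.

```python
def tarjan_sccs(graph):
    """Return the strongly connected components of a digraph {node: [successors]}."""
    index, low, stack, on_stack, result = {}, {}, [], set(), []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            result.append(comp)

    for v in graph:
        if v not in index:
            visit(v)
    return result


def input_states(graph, comp):
    return {t for s in graph if s not in comp for t in graph[s] if t in comp}


def output_states(graph, comp):
    return {t for s in comp for t in graph[s] if t not in comp}


# Hypothetical example: the cycle between t and a forms the only non-trivial SCC.
example_graph = {"s": ["t"], "t": ["a"], "a": ["t", "u"], "u": ["u"]}
components = tarjan_sccs(example_graph)
nontrivial = [k for k in components if len(k) > 1]
```

The recursive formulation is chosen for brevity; for very large state spaces an iterative variant would avoid Python's recursion limit.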

Searching stage. To find most indicative torrent-counterexamples in $\mathcal{D}$,
we find most indicative counterexamples in $\mathrm{Ac}(\mathcal{D}_\psi)$. For this we use the
same approach as [HK07a], turning the MC into a weighted digraph to
replace the problem of finding the finite path with highest probability by a
shortest path problem. The nodes of the digraph are the states of the MC
and there is an edge between $s$ and $t$ if $\mathcal{P}(s, t) > 0$. The weight of such an
edge is $-\log(\mathcal{P}(s, t))$.

Finding the most indicative counterexample in $\mathrm{Ac}(\mathcal{D}_\psi)$ is now reduced
to finding $k$ shortest paths. As explained in [HK07a], our algorithm has to
compute $k$ on the fly. Eppstein's algorithm [Epp98] produces the $k$ shortest
paths in general in $O(m + n \log n + k)$, where $n$ is the number of nodes and
$m$ the number of edges. In our case, since $\mathrm{Ac}(\mathcal{D}_\psi)$ is acyclic, the complexity
decreases to $O(m + k)$.
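
The reduction to shortest paths, with $k$ determined on the fly, can be sketched in Python as follows. This is a naive best-first enumeration over the $-\log$ weights and not Eppstein's algorithm, so it does not achieve the complexity stated above; it is only meant to show the idea on a small acyclic chain. The encoding, the names, and the example are assumptions of this sketch, and absorbing states are given no outgoing edges so that the graph is literally acyclic.

```python
import heapq
from math import exp, log


def most_probable_paths(chain, start, targets):
    """Yield (probability, path) for paths reaching a target, most probable first."""
    heap = [(0.0, [start])]  # entries are (sum of -log probabilities, path)
    while heap:
        cost, path = heapq.heappop(heap)
        state = path[-1]
        if state in targets:
            yield exp(-cost), path
            continue
        for succ, prob in chain[state].items():
            if prob > 0:
                heapq.heappush(heap, (cost - log(prob), path + [succ]))


def counterexample(chain, start, targets, p):
    """Collect most probable paths until their total mass exceeds the bound p."""
    paths, mass = [], 0.0
    for prob, path in most_probable_paths(chain, start, targets):
        paths.append(path)
        mass += prob
        if mass > p:
            return paths, mass
    return None  # the bound p is not violated


# Hypothetical acyclic chain: g is the undesired state, u is harmless.
acyclic_chain = {
    "s0": {"s1": 0.6, "s2": 0.3, "u": 0.1},
    "s1": {"g": 0.7, "u": 0.3},
    "s2": {"g": 1.0},
    "g": {},
    "u": {},
}
found_paths, found_mass = counterexample(acyclic_chain, "s0", {"g"}, 0.5)
```

Because all edge weights are non-negative, extending a path never lowers its cost, so target paths leave the priority queue in order of decreasing probability.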

5.7.3 Debugging issues 

Representative finite paths. What we have computed so far is a most
indicative counterexample to $\mathrm{Ac}(\mathcal{D}_\psi) \models_{\leq p} \Diamond\psi$. This is a finite set of rails,
i.e., a finite set of paths in $\mathrm{Ac}(\mathcal{D}_\psi)$. Each of these paths $\sigma$ represents a
witness $\mathrm{TorrGen}(\mathcal{D}, \sigma)$. Note that this witness itself usually has infinitely
many elements.

In practice, one has to display a witness to the user. The obvious way
would be to show the user the rail $\sigma$. This, however, may be confusing to
the user as $\sigma$ is not a finite path of the original Markov Decision Process.
Instead of presenting the user with $\sigma$, we therefore show the user the finite
path of $\mathrm{TorrGen}(\mathcal{D}, \sigma)$ with highest probability.



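Definition 5.7.1. Let $\mathcal{D}$ be an MC, and $\sigma \in \mathrm{Paths}^*(\mathrm{Ac}(\mathcal{D}_\psi))$ a rail of $\mathcal{D}$.
We define the representant of $\mathrm{Torr}(\mathcal{D}, \sigma)$ as

$\mathrm{repTorr}(\mathcal{D}, \sigma) = \mathrm{repTorr}\Big(\biguplus_{\rho \in \mathrm{TorrGen}(\mathcal{D}, \sigma)} \langle\rho\rangle\Big) = \arg\max_{\rho \in \mathrm{TorrGen}(\mathcal{D}, \sigma)} \mathbb{P}(\langle\rho\rangle)$.

Note that given $\mathrm{repTorr}(\mathcal{D}, \sigma)$ one can easily recover $\sigma$. Therefore, no
information is lost by presenting a torrent as one of its generators instead of
as a rail.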

Expanding SCC. Note that in the pre-processing stage, we reduced the size
of many SCCs of the system (and likely even completely removed some) by
making states in $\mathrm{Sat}(\psi) \cup (S \setminus \mathrm{Sat}_\Diamond(\psi))$ absorbing. However, it is
possible that the system still contains some very large strongly connected
components. In that case, a single witness could have a very large
probability mass and one could argue that the information presented to the
user is not detailed enough. For instance, consider the Markov chain of
Figure 5.6 in which there is a single large SCC with input state t and
output state u.

[Figure 5.6]

The most indicative torrent-counterexample to the property $\mathcal{D} \models_{\leq 0.9} \Diamond\psi$
is simply $\{\mathrm{TorrGen}(stu)\}$, i.e., a single witness with probability mass 1 associated
to the rail $stu$. Although this may seem uninformative, we argue that
it is more informative than listing several paths of the form $st \cdots u$ with
probability summing up to, say, 0.91. Our single witness counterexample
suggests that the outgoing transition to a state not reaching $\psi$ was simply
forgotten in the design; the listing of paths still allows the possibility that
one of the probabilities in the whole system is simply wrong.

Nevertheless, if the user needs more information to tackle bugs inside 
SCCs, note that there is more information available at this point. In par- 
ticular, for every strongly connected component K, every input state s of K 
(even for every state in K), and every output state t of K, the probability 
of reaching t from s is already available from the computation of $\mathrm{Ac}(\mathcal{D}_\psi)$
during the SCC analysis stage of Section 5.7.2. 



Recently, some work has been done on counterexample generation techniques
for different variants of probabilistic models (discrete-time Markov chains
and continuous-time Markov chains) [AHL05, AL06, HK07a, HK07b]. In our
terminology, these works consider witnesses consisting of a single finite path.
We have already discussed in the Introduction that the single path approach
does not meet the properties of accuracy, originality, significance,
and finiteness.

Instead, our witness/torrent approach provides a high level of abstrac- 
tion of a counterexample. By grouping together finite paths that behave 
the same outside strongly connected components in a single witness, we 
can achieve these properties to a higher extent. Behaving the same outside 
strongly connected components is a reasonable way of formalizing the con- 
cept of providing similar debugging information. This grouping also makes 
witnesses significantly different from each other: each witness comes from 
a different rail and each rail provides a different way to reach the undesired 
property. Then each witness provides original information. Of course, our 
witnesses are more significant than single finite paths, because they are 
sets of finite paths. This also gives us more accuracy than the approach 
with single finite paths, as a collection of finite paths behaving the same 
and reaching an undesired condition with high probability is more likely to 
show how the system reaches this condition than just a single path. Finally, 
because there is a finite number of rails, there is also a finite number of 
witnesses. 

Another key difference of our work with respect to previous ones is 
that our technique allows us to generate counterexamples for probabilistic 
systems with nondeterminism. However, an independent and concurrent 
study of counterexample generation for MDPs was carried out by Aljazzar 
and Leue [AL09]. There, the authors consider generating counterexamples 
for a fragment of pCTL, namely upper bounded formulas without nested 
temporal operators. The authors present three methods for generating 
counterexamples and study conditions under which these methods are suit- 
able. 

More recently, Schmalz et al. also investigated quantitative counterexample
generation for LTL formulas [SVV09]. In qualitative probabilistic
model checking, a counterexample is presented as a pair $(\alpha, \gamma)$, where $\alpha$
and $\gamma$ are finite words such that all paths that extend $\alpha$ and have infinitely
many occurrences of $\gamma$ violate the property under consideration. In quantitative
probabilistic model checking, a counterexample is presented as a
pair $(W, R)$, where $W$ is a set of such finite words $\alpha$ and $R$ is a set of such
finite words $\gamma$.

Similar SCC reduction techniques to the one presented in this chapter
have been studied for different purposes. In [lGM02], the authors focus on
the problem of software testing. They use Markov chains to model software
behaviour and SCC analysis to decompose the state space of large
Markov chains. More recently, Abraham et al. presented a model checker
for Markov chains based on the detection and abstraction of strongly connected
components [AJW+10]. Their algorithm has the advantage of offering
abstract counterexamples, which can be interactively refined by the
user.

Finally, the problem of presenting counterexamples as single paths has 
also been observed by Han, Katoen, and Damman [DHK08, HKD09]. There, 
the authors propose to use regular expressions to group paths together. 
Thus, in the same way that we group together paths behaving the same
outside SCCs, they group together paths associated to the same regular
expression.

For a more extensive survey on quantitative counterexample generation 
for (both discrete and continuous time) Markov chains we refer the reader 
to chapters 3, 4, and 5 of [Han09]. 



Chapter 6 



Interactive Systems and 
Equivalences for Security 



In this overview chapter we briefly discuss extensions to the
frameworks presented in Chapters 3 and 4. First, we consider
the case in which secrets and observables interact (in contrast
with the situation in Chapter 3), and show that it is still possible
to define an information-theoretic notion of leakage, provided
that we consider a more complex notion of channel, known in
the literature as channel with memory and feedback. Second, we
extend the systems proposed in Chapter 4 by allowing nondeterminism
also internally to the components. Correspondingly, we
define a richer notion of admissible scheduler and we
use it to define notions of process equivalence that relate to
nondeterminism in a more flexible way than the standard ones in
the literature. In particular, we use these equivalences to define
notions of anonymity that are robust with respect to implementation
refinement.



For more information about the topics discussed in this chapter we refer the reader
to [AAP10a, AAP11, AAP10b, AAPvR10].

6.1 Interactive Information Flow

In this section we discuss the applicability of the information-theoretic
approach to interactive systems. These systems were already considered
in [DJGP02]. In that paper the authors proposed to define the matrix
elements $P(b \mid a)$ as the measure of the traces with (secret, observable)-projection
$(a, b)$, divided by the measure of the traces with secret projection
$a$. This follows the definition of conditional probability in terms of joint
and marginal probability. However, this approach does not lead to an
information-theoretic channel. In fact, (by definition) a channel should be
invariant with respect to the input distribution and such a construction is
not (as shown by Example 3.7.3).

In [AAP10a] and more recently in [AAP11], we consider an extension
of the theory of channels which makes the information-theoretic approach
applicable also to the case of interactive systems. It turns out that a richer
notion of channel, known in Information Theory as channels with memory 
and feedback, serves our purposes. The dependence of inputs on previous 
outputs corresponds to feedback, and the dependence of outputs on previ- 
ous inputs and outputs corresponds to memory. 

Let us explain in more detail the difference with the classical approach.
In non-interactive systems, since the secrets always precede the observables, 
it is possible to group the sequence of secrets (and observables) in a single 
secret (respectively, observable) string. If we consider only one activation 
of the system, or if each use of the system is independent from the other, 
then we can model it as a discrete classical channel (memoryless, and with- 
out feedback) from a single input string to a single output string. When 
we have interactive systems, however, inputs and outputs may interleave 
and influence each other. Considering some sort of feedback in the channel 
is a way to capture this richer behavior. Secrets have a causal influence on 
observables via the channel, and, in the presence of interactivity, observ- 
ables have a causal influence on secrets via the feedback. This alternating 
mutual influence between inputs and outputs can be modeled by repeated 
uses of the channels. However, each time the channel is used it represents 
a different state of the computation, and the conditional probabilities of 
observables on secrets can depend on this state. The addition of memory
to the model allows expressing the dependency of the channel matrix on 
such a state (which, as we will see, can also be represented by the history 
of inputs and outputs). 

Recent results in Information Theory [TM09] have shown that, in chan- 
nels with memory and feedback, the transmission rate does not correspond 
to the maximum mutual information (capacity), but rather to the maxi- 
mum of the so-called directed information. Intuitively, this is due to the 
fact that mutual information expresses the correlation between the input 
and the output, and therefore it includes feedback. However, the feedback, 
i.e. the way the output influences the next input, should not be considered
part of the information transmitted. Directed information is essentially 
mutual information minus the dependence of the next input on previous 
output. We propose to adopt directed information and the corresponding 
notion of directed capacity to represent leakage. 
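
For reference, the intuition above matches the standard definition of directed information from the information-theory literature (due to Massey). The notation below is ours and not taken from [TM09] or [AAP11]: $X^t = X_1, \ldots, X_t$ denotes the inputs (secrets) up to time $t$, $Y^t$ the outputs (observables) up to time $t$, and $T$ the number of uses of the channel.

```latex
I(X^T \to Y^T) \;=\; \sum_{t=1}^{T} I(X^t ; Y_t \mid Y^{t-1}),
\qquad
I(X^T ; Y^T) \;=\; I(X^T \to Y^T) \;+\; \sum_{t=1}^{T} I(X_t ; Y^{t-1} \mid X^{t-1}).
```

The second identity makes the "mutual information minus feedback" reading explicit: the last sum measures how much each new input depends on the previous outputs, and it vanishes when there is no feedback, in which case directed information and mutual information coincide.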

Our extension is a generalization of the classical model, in the sense 
that it can represent both interactive and non-interactive systems. One 
important feature of the classical approach is that the choice of secrets is 
seen as external to the system, i.e. determined by the environment. This 
implies that the probability distribution on the secrets (input distribution) 
constitutes the a priori knowledge and does not count as leakage. In order 
to encompass the classical approach, in our extended model we should 
preserve this principle, and the most natural way is to consider the secret 
choices, at every stage of the computation, as external. Their probability 
distributions, which are now in general conditional probability distributions 
(depending on the history of secrets and observables) should be considered 
as part of the external knowledge, and should not be counted as leakage. 

A second contribution of [AAP10a] and [AAP11] is the proof that the
channel capacity is a continuous function with respect to the Kantorovich
metric on interactive systems. This was pointed out also in [DJGP02]; however, their
construction does not work in our case because (as far as we understand)
it assumes that the probability of a secret action (at any point of
the computation) is different from 0. This assumption is not guaranteed in
our case and therefore we had to come up with a different reasoning. The
fact that our proof does not need this assumption shows that the intuition
of [DJGP02] concerning the continuity of capacity is valid in general.

6.1.1 Applications 

Interactive systems can be found in a variety of disparate areas such as game 
theory, auction protocols, and zero-knowledge proofs. We now present two 
examples of interactive systems. 

• In the area of auction protocols, consider the cocaine auction protocol
[SA99]. The auction is organized as a succession of rounds of bidding.
Round $i$ starts with the seller announcing the bid price $b_i$ for that
round. Buyers have $t$ seconds to make an offer (i.e. to say yes,
meaning "I am willing to buy at the current bid price $b_i$"). As soon
as one buyer says yes, he becomes the winner $w_i$ of that round and
a new round begins. If nobody says anything for $t$ seconds, round $i$
is concluded by timeout and the auction is won by the winner $w_{i-1}$
of the previous round. The identities of the buyers in each round
constitute the input of the channel, whereas the bid prices constitute
the output of the channel. Note that inputs and outputs alternate
so the system is interactive. It is also easy to see that inputs depend
on past outputs (feedback): the identity of the winner of each round
depends on the previous bid prices. Furthermore, outputs depend on
the previous inputs (memory): (in some scenarios) the bid price of
round $i$ may depend on the identity of previous winners. For more
details on the modeling of this protocol using channels with memory
and feedback see [AAP11].

• In the area of game theory, consider the classic prisoner's dilemma 
(the present formulation is due to Albert W. Tucker [Pou92], but it 
was originally devised by Merrill Flood and Melvin Dresher in 1950). 
Two suspects are arrested by the police. The police have insufficient 
evidence for a conviction, and, having separated both prisoners, visit 
each of them to offer the same deal. If one testifies (defects from the 
other) for the prosecution against the other and the other remains 
silent (cooperates with the other), the betrayer goes free and the silent
accomplice receives the full 10-year sentence. If both remain silent, 
both prisoners are sentenced to only six months in jail for a minor 
charge. If each betrays the other, each receives a five-year sentence. 
Each prisoner must choose to betray the other or to remain silent. 
Each one is assured that the other would not know about the betrayal 
before the end of the investigation. In the iterated prisoner's dilemma, 
the game is played repeatedly. Thus each player has an opportunity 
to punish the other player for previous non-cooperative play. In this 
case the strategy (cooperate or defect) of each player is the input of 
the channel and the sentence is the output. Once again, it is easy 
to see that the system is interactive: inputs and outputs alternate. 
Furthermore, inputs depend on previous outputs (the strategy depends
on the past sentences) and outputs depend on previous inputs (the
sentence of the suspects depends on their declarations - cooperate or
defect).
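
The alternation of inputs and outputs in the iterated prisoner's dilemma can be made concrete with a small Python sketch. The sentences are the ones given above (in years); the two strategies, the function names, and the decision to let a strategy see only its own past sentences are assumptions made for illustration, not part of the modeling in [AAP11].

```python
# Sentences in years for (choice of A, choice of B); C = cooperate, D = defect.
SENTENCE = {
    ("C", "C"): (0.5, 0.5),
    ("C", "D"): (10, 0),
    ("D", "C"): (0, 10),
    ("D", "D"): (5, 5),
}


def tit_for_tat(my_sentences):
    """Defect exactly when the last sentence reveals that the other defected."""
    if my_sentences and my_sentences[-1] in (10, 5):
        return "D"
    return "C"


def always_defect(my_sentences):
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Return the trace of (inputs, outputs) pairs, one per round."""
    history_a, history_b, trace = [], [], []
    for _ in range(rounds):
        # Inputs depend on previous outputs (feedback).
        a, b = strategy_a(history_a), strategy_b(history_b)
        # Outputs depend on the inputs of the round.
        sentence_a, sentence_b = SENTENCE[(a, b)]
        history_a.append(sentence_a)
        history_b.append(sentence_b)
        trace.append(((a, b), (sentence_a, sentence_b)))
    return trace


example_trace = play(tit_for_tat, always_defect, 3)
```

In the resulting trace the first player cooperates only in the first round: the 10-year sentence it receives (an output) determines its next choice (an input), which is exactly the feedback described above.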

6.2 Nondeterminism and Information Flow 

The noise of channel matrices, i.e. the similarity between the rows of 
the channel matrix, helps prevent the inference of the secret from the
observables. In practice noise is created by using randomization, see for 
instance the DCNet [Cha88] and the Crowds [RR98] protocols. 

In the literature about the foundations of Computer Security, however, 
the quantitative aspects are often abstracted away, and probabilistic be- 
havior is replaced by nondeterministic behavior. Correspondingly, there 
have been various approaches in which information-hiding properties are 
expressed in terms of equivalences based on nondeterminism, especially in 
a concurrent setting. For instance, [SS96] defines anonymity as follows
(the actual definition of [SS96] is more complicated, but the spirit is the
same): a protocol $S$ is anonymous if, for every pair of culprits $a$ and $b$,
$S[^a/_x]$ and $S[^b/_x]$ produce the same observable traces. A similar definition
is given in [AG99] for secrecy, with the difference that $S[^a/_x]$ and $S[^b/_x]$
are required to be bisimilar. In [DKR09], an electoral system $S$ preserves the
confidentiality of the vote if for any voters $v$ and $w$, the observable behavior
of $S$ is the same if we swap the votes of $v$ and $w$. Namely,
$S[^a/_v \mid {}^b/_w] \sim S[^b/_v \mid {}^a/_w]$, where $\sim$ represents bisimilarity.

These proposals are based on the implicit assumption that all the non- 
deterministic executions present in the specification of S will always be pos- 
sible under every implementation of S. Or at least, that the adversary will 
believe so. In concurrency, however, as argued in [CNP09], nondeterminism 
has a rather different meaning: if a specification S contains some nonde- 
terministic alternatives, typically it is because we want to abstract from 
specific implementations, such as the scheduling policy. A specification is 
considered correct, with respect to some property, if every alternative satis- 
fies the property. Correspondingly, an implementation is considered correct 
if all executions are among those possible in the specification, i.e. if the 
implementation is a refinement of the specification. There is no expecta- 
tion that the implementation will actually make possible all the alternatives 
indicated by the specification. 

We argue that the use of nondeterminism in concurrency corresponds 
to a demonic view: the scheduler, i.e. the entity that will decide which 
alternative to select, may try to choose the worst alternative. Hence we 
need to make sure that "all alternatives are good", i.e. satisfy the intended
property. In the above mentioned approaches to the formalization of se- 
curity properties, on the contrary, the interpretation of nondeterminism is 
angelic: the scheduler is expected to actually help the protocol to confuse 
the adversary and thus protect the secret information. 

There is another issue, orthogonal to the angelic/demonic dichotomy, 
but relevant for the achievement of security properties: the scheduler should 
not be able to make its choices dependent on the secret, or else nearly every 
protocol would be insecure, i.e. the scheduler would always be able to leak 
the secret to an external observer (for instance by producing different inter- 
leavings of the observables, depending on the secret). This remark has been 
made several times already, and several approaches have been proposed to 
cope with the problem of the "almighty" scheduler (aka omniscient, clairvoyant,
etc.), see for example [CCK+06a, GD07, CNP09, APvRS11, CP10].

The risk of a naive use of nondeterminism to specify a security property,
is not only that it may rely on an implicit assumption that the scheduler 
behaves angelically, but also that it is clairvoyant, i.e. that it peeks at the 
secrets (that it is not supposed to be able to see) to achieve its angelic 
strategy. 

Consider the following system, in a CCS-like syntax: 

$S = (c, out)(A \parallel Corr \parallel H_1 \parallel H_2)$,

with $A \triangleq \bar{c}\langle sec \rangle$, $Corr \triangleq c(s).\overline{out}\langle s \rangle$, $H_1 \triangleq c(s).\overline{out}\langle a \rangle$, $H_2 \triangleq c(s).\overline{out}\langle b \rangle$,
and where $\parallel$ is the parallel operator, $\bar{c}\langle sec \rangle$ is a process that sends $sec$ on
channel $c$, $c(s).P$ is a process that receives $s$ on channel $c$ and then continues
as $P$, and $(c, out)$ is the restriction operator, enforcing synchronization on
$c$ and $out$. In this example, $sec$ represents secret information.

It is easy to see that we have $S[^a/_{sec}] \sim S[^b/_{sec}]$. Note that, in order to
simulate the third branch in $S[^a/_{sec}]$, the process $S[^b/_{sec}]$ needs to select
its first branch. Vice versa, in order to simulate the third branch in $S[^b/_{sec}]$,
the process $S[^a/_{sec}]$ needs to select its second branch. This means that, in
order to achieve bisimulation, the scheduler needs to know the secret, and
change its choice accordingly.
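
The dependence of the matching choice on the secret can be checked with a small Python sketch. It is a deliberately coarse abstraction and not a CCS semantics: each receiver is modeled as a function from the received value to the value it emits, and a scheduler choice is simply the name of the receiver that gets the message. All names are hypothetical.

```python
# Each receiver maps the value received on c to the value it outputs.
RECEIVERS = {
    "Corr": lambda s: s,    # forwards whatever it receives
    "H1": lambda s: "a",    # always outputs a
    "H2": lambda s: "b",    # always outputs b
}


def outputs(secret):
    """Observable produced by each possible scheduler choice for a given secret."""
    return {name: emit(secret) for name, emit in RECEIVERS.items()}


def matching_branches(secret, other_secret, branch):
    """Branches of the system with `other_secret` that mimic `branch` under `secret`."""
    observable = outputs(secret)[branch]
    return sorted(name for name, out in outputs(other_secret).items() if out == observable)


same_observables = set(outputs("a").values()) == set(outputs("b").values())
match_when_secret_is_a = matching_branches("a", "b", "Corr")
match_when_secret_is_b = matching_branches("b", "a", "Corr")
```

Both instances produce the same set of observables, but the branch that has to be chosen to mimic Corr differs between the two secrets, so a scheduler that achieves the match must depend on the secret.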

This example shows a system that intuitively is not secure, because 
the third component, Corr, reveals whatever secret it receives. However, 
according to the equivalence-based notions of security discussed above, it 
is secure. But it is secure thanks to a scheduler that angelically helps 
the system to protect the secret, and it does so by making its choices 
dependent on the secret! In our opinion these assumptions on the scheduler 
are excessively strong. 

In recent work [AAPvR10] we address the above issue by defining
a framework in which it is possible to combine both angelic and demonic 
nondeterminism in a setting in which also probabilistic behavior may be 
present, and in a context in which the scheduler is restricted (i.e. not 
clairvoyant). We propose safe versions of typical equivalence relations 
(traces and bisimulation), and we show how to use them to characterize 
information-hiding properties. 



Chapter 7 

Conclusion 



In this chapter we summarize the main contributions of this 
thesis and discuss further directions. 

7.1 Contributions 

The goal of this thesis is to develop a formal framework for specifying,
analyzing and verifying anonymity protocols and, more generally, information
hiding protocols.

As discussed in the Introduction, conditional probabilities are a key 
concept in assessing the degree of information protection. In Chapter 2, we 
have extended the probabilistic temporal logic pCTL to cpCTL, in which 
it is possible to express conditional probabilities. We have also proved 
that optimal scheduling decisions can always be reached by a deterministic 
and semi history-independent scheduler. This fundamental result allowed
us to define an algorithm to verify cpCTL formulas. Our algorithm first 
reduces the MDP to an acyclic MDP and then computes optimal conditional 
probabilities in the acyclic MDP. In addition, we have defined a notion 
of counterexample for conditional formulas and sketched an algorithm for 
counterexample generation. 

We then turned our attention to more practical grounds. In Chapter
3, we have addressed the problem of computing the information leakage of
a system in an efficient way. We have proposed two methods: one based
on reachability techniques and the other based on quantitative counterexample
generation. In addition, we have shown that when the automaton
is interactive it is not possible to define its channel in the standard way.
An intriguing problem is how to extend the notion of channel so as to capture
the dynamic nature of interaction. In Chapter 6 we have briefly discussed
how to solve this problem by using more complex information-theoretic
channels, namely channels with memory and feedback.

In Chapter 4, we have attacked a well known problem of concurrent 
information-hiding protocols, namely full-information scheduling. To over- 
come this problem, we have defined a class of partial-information schedulers 
which can only base their decisions on the information that they have avail- 
able. In particular they cannot base their decisions on the internal behavior 
of the components. We have used admissible schedulers to resolve nonde- 
terminism in a realistic way, and to revise some anonymity definitions from 
the literature. In addition, we have presented a technique to prove the var- 
ious definitions of anonymity proposed in the chapter. This is particularly 
interesting considering that many problems related to restricted schedulers 
have been shown to be undecidable. We have illustrated the applicability of 
our proof technique by proving that the well-known DC protocol is anony- 
mous when considering admissible schedulers, in contrast to the situation 
when considering full-information schedulers. 

The last major contribution of this thesis is a novel technique for representing
and computing counterexamples for nondeterministic and probabilistic
systems. In Chapter 5, we have shown how to carefully partition
a counterexample into sets of paths. These sets are intended to provide
information related to the violation of the property under consideration, so
we call them witnesses. Five properties that witnesses should satisfy (in 
order to provide significant debugging information) are identified in this 
chapter. The key contribution of this chapter is a technique based on 
strongly connected component analysis that makes it possible to partition 
counterexamples into witnesses satisfying the desired properties. 

7.2 Further directions

There are several ways of extending the work presented in this thesis. 

As we have shown in Chapter 2, the most important issue when computing
conditional probabilities is that optimizing schedulers are not determined
by the local structure of the system. As a consequence, it is not
possible to reduce the problem of verifying cpCTL to a linear optimization
problem (as is the case with pCTL). A natural question arising from this
observation is whether the problem of model checking conditional probabilities
is inherently exponential or not. We believe that it is; however, we
think that it is also possible to find suitable restrictions (either
to the formulas or to the systems under consideration) that would make it
possible to model check conditional probabilities in polynomial time.

On a more practical note, counterexample generation for probabilistic
model checking is nowadays a very hot topic for which several applications 
in the most diverse areas have been identified. During the last few years, 
many techniques have been proposed for different flavours of logics and 
models. However, to the best of our knowledge, no practical tool to au- 
tomatically generate quantitative counterexamples has been implemented. 
We believe that such a practical tool could be a significant contribution 
to the field. More concretely, we believe that a tool implementing the 
regular-expression and k-shortest path techniques introduced by Han et al. 
in combination with the SCC analysis techniques presented in this thesis 
would be of great value. 

In Chapter 3, we have made a connection between quantitative counterexample
generation and information leakage computation. Thanks to
this connection, such a tool would also allow us to compute or approximate
the leakage of large-scale protocols. Furthermore, it would make it possible to
investigate in more depth how the debugging information provided by the
tool can be used to identify flaws of the protocol causing high leakage.

Finally, as for most definitions of partial-information schedulers from
the literature, our notions of admissible schedulers may raise undecidability
issues. Thus, it would be interesting to investigate whether the notions of
anonymity proposed in Chapter 4 are actually verifiable (remember that
the proof technique we proposed is sufficient but not necessary). Another
interesting direction for future work is to adapt well-known isomorphism-checking
algorithms and tools to our setting in order to automatically verify
some anonymity properties.
Samenvatting (Summary)



As we enter the digital era, there is ever-growing concern about the
amount of digital data that is being gathered about us. Websites often
track people's browsing behavior, health insurers gather medical data, and
smartphones and navigation systems transmit information that makes it
possible to determine the physical location of their users. As a result,
anonymity, and privacy in general, are increasingly at stake. Anonymity
protocols try to counter this trend by enabling anonymous communication
over the Internet. To guarantee the correctness of such protocols, which
are often extremely complex, a sound framework is required in which
anonymity properties can be expressed and analyzed. Formal methods provide
a set of mathematical techniques that make it possible to rigorously
specify and verify anonymity properties.

This thesis is about the foundations of formal methods for applications
in computer security, and in particular in anonymity. Concretely, we
develop frameworks for specifying anonymity properties and algorithms for
verifying them. Since in practice anonymity protocols always leak some
information, we focus on quantitative properties that describe the amount
of information leaked by a protocol.

We begin our research on anonymity from the basics, namely conditional
probabilities. These are the key ingredients of most quantitative
anonymity properties. In Chapter 2 we present cpCTL, the first temporal
logic in which conditional probabilities can be expressed. We also present
an algorithm to verify cpCTL formulas with a model checker. Together with
a model checker, this logic makes it possible to specify and verify
quantitative anonymity properties of complex systems in which both
probabilistic and nondeterministic behavior occur.

Next we turn to more practical matters: the construction of algorithms
that measure the amount of information leaked. More precisely, Chapter 3
describes polynomial algorithms to quantify the (information-theoretic)
information leakage of several kinds of fully probabilistic protocols
(i.e., protocols without nondeterministic behavior). The techniques in
this chapter are the first that make it possible to compute the
information leakage of interactive protocols.

In Chapter 4 we address a well-known problem in distributed anonymity
protocols, namely full-information schedulers. To solve this problem, we
propose an alternative definition of scheduler, together with new
definitions of anonymity (varying with the capabilities of the attacker),
and we revise the well-known definition of strong anonymity from the
literature. In addition, we present a technique for checking whether a
distributed protocol satisfies some of these definitions.

In Chapter 5 we present counterexample-based techniques that make it
possible to debug complex systems. These techniques make it possible to
detect flaws in security protocols. Finally, in Chapter 6, we briefly
describe extensions of the frameworks and techniques from Chapters 3
and 4.